Sci Rep. 2025 Oct 3;15(1):34580. doi: 10.1038/s41598-025-18053-3.
ABSTRACT
Depression among older adults is a critical public health issue, particularly when it coexists with non-communicable diseases (NCDs). In India, where population ageing and the NCD burden are rising rapidly, scalable data-driven approaches are needed to identify at-risk individuals. Using data from the Longitudinal Ageing Study in India (LASI) Wave 1 (2017-2018; N = 58,467), the study evaluated eight supervised machine learning models (random forest, decision tree, logistic regression, SVM, KNN, naïve Bayes, neural network and ridge classifier) for predicting depression among older adults. Model performance was assessed using a 70/30 train-test split and stratified 10-fold cross-validation. Evaluation metrics included AUROC, PR-AUC, accuracy, sensitivity, specificity and F1-score; interpretability was assessed via SHAP. Random forest outperformed all other models, achieving an AUROC of 0.996, an accuracy of 95.6% and an F1-score of 0.954, demonstrating excellent discrimination and calibration. Decision tree followed closely, with an AUROC of 0.915, an accuracy of 91.5% and an F1-score of 0.908. Key predictors of depression included poor sleep, age, BMI, IADL limitations, MPCE quintile, religion, smoking, education and physical inactivity. SHAP values validated the clinical plausibility of these features. A reduced-feature model using the top 12 predictors retained high accuracy while enhancing interpretability. The findings demonstrate the utility of ML models, particularly random forest, for identifying depression risk in older adults. The integration of interpretability techniques, namely SHAP along with Information Gain, enhances clinical relevance. These results have potential implications for scalable screening strategies and policy-driven interventions in geriatric mental health.
PMID:41044138 | DOI:10.1038/s41598-025-18053-3
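For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, F1-score and AUROC are computed for a binary classifier. It is not the study's code: the function names (binary_metrics, auroc), the 0.5 decision threshold and the eight toy labels and scores are all assumptions made for illustration, and no LASI data is used.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and F1 from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    f1_den = precision + sensitivity
    f1 = 2 * precision * sensitivity / f1_den if f1_den else 0.0
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
print(metrics, auc)
```

Note that AUROC is computed from the continuous scores and so does not depend on the chosen threshold, whereas accuracy, sensitivity, specificity and F1-score all change when the threshold moves.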