Geroscience. 2025 Nov 1. doi: 10.1007/s11357-025-01951-9. Online ahead of print.

ABSTRACT

This study aimed to investigate, validate, and apply machine learning (ML) algorithms to predict functional disability in elderly individuals using data from ELSI-Brazil. It also sought to map the performance of the models and to identify key multidimensional variables (encompassing sociodemographic and economic aspects, health status, behaviors, mental health, and access to services) that could serve as early risk indicators and, based on the selected model, to understand which characteristics count toward or against a positive screen for functional disability. Data from 4502 ELSI-Brazil participants (2015-2016) were analyzed after selection and pre-processing, which included imputation of missing data, standardization, and one-hot encoding. Forty-nine predictor variables from the sociodemographic, health, and behavioral domains were selected to develop classification models. SMOTE oversampling and tenfold cross-validation, combined with Bayesian optimization, were applied. The selected model was interpreted using SHAP analysis. The Ridge Classifier model showed robust performance, with a ROC-AUC of 0.785 (95% CI: 0.756-0.813), a sensitivity of 0.703, and a specificity of 0.723, in addition to a high negative predictive value (84.5%). SHAP analysis showed that variables such as depressive symptoms, concern about mobility, and self-rated health status were decisive in classifying functional disability risk. The results suggest that ML techniques, combined with multidimensional health data commonly collected in primary care settings, offer a promising tool for screening and early intervention in functional disability in the elderly. This approach may inform decision-making in clinical practice and in health support policies aimed at active and healthy aging.

PMID:41175308 | DOI:10.1007/s11357-025-01951-9