Explainable Machine Learning with Bayesian Hyper-Optimization for Predicting Cognitive Impairment from Longitudinal Nursing Home Data
Abstract
The monitoring of daily life in nursing home residents generates diverse and heterogeneous sources of information. Artificial Intelligence (AI) is increasingly used to predict a wide range of outcomes in both research and clinical practice, including mortality and cognitive impairment (CI). A key challenge is determining which information sources provide the most accurate predictions. In this work, we introduce a novel AI-based methodology that integrates Bayesian optimization, XGBoost, and explainable AI (SHAP) to predict CI in nursing home residents using 13 years of heterogeneous longitudinal data from 2,608 individuals. Our approach enables interpretable predictions of CI-related clinical scales such as the Mini-Mental State Examination (MMSE), the Global Deterioration Scale (GDS), and the Barthel Scale, while revealing the relative contributions of the underlying information sources, including clinical metrics and activity records. To our knowledge, this is the first framework to combine harmonized temporal modeling, Bayesian-optimized ensemble learning, and SHAP-based interpretability to evaluate the predictive relevance of heterogeneous clinical and behavioral data sources in a real-world long-term care setting. This integrated approach not only improves predictive performance for CI-related scores but also offers interpretable insights that can inform personalized care strategies.
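The pipeline the abstract describes can be sketched in miniature. The code below is an illustrative stand-in, not the authors' implementation: scikit-learn's `GradientBoostingRegressor` substitutes for XGBoost, a small Gaussian-process loop with expected improvement substitutes for the paper's Bayesian optimizer, permutation importance substitutes for SHAP attributions, and the data are synthetic rather than the nursing-home records used in the study.

```python
# Illustrative sketch of the three-stage pipeline: Bayesian hyperparameter
# optimization of a gradient-boosted regressor (a stand-in for XGBoost)
# predicting a continuous clinical score, followed by a model-agnostic
# importance analysis (a stand-in for SHAP). All data here are synthetic.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def objective(log_lr):
    """Cross-validated R^2 for a given log10(learning_rate)."""
    model = GradientBoostingRegressor(learning_rate=10 ** log_lr,
                                      n_estimators=100, random_state=0)
    return cross_val_score(model, X_tr, y_tr, cv=3).mean()

# Bayesian optimization over log10(learning_rate) in [-3, 0]:
# fit a GP surrogate to the evaluated points, then pick the candidate
# maximizing expected improvement (EI) at each iteration.
bounds = (-3.0, 0.0)
sampled = list(rng.uniform(*bounds, size=3))          # initial random design
scores = [objective(v) for v in sampled]
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
for _ in range(7):
    gp.fit(np.array(sampled).reshape(-1, 1), scores)
    cand = np.linspace(*bounds, 200).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    best = max(scores)
    gain = mu - best
    z = gain / (sd + 1e-9)
    ei = gain * norm.cdf(z) + sd * norm.pdf(z)        # expected improvement
    nxt = float(cand[np.argmax(ei), 0])
    sampled.append(nxt)
    scores.append(objective(nxt))

# Refit with the best hyperparameter, then rank feature contributions.
best_lr = 10 ** sampled[int(np.argmax(scores))]
final = GradientBoostingRegressor(learning_rate=best_lr, n_estimators=100,
                                  random_state=0).fit(X_tr, y_tr)
imp = permutation_importance(final, X_te, y_te, n_repeats=5, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
print("best learning rate:", round(best_lr, 4))
print("top features by permutation importance:", ranking[:3].tolist())
```

In the actual study, each feature group would correspond to one information source (clinical metrics, activity records, and so on), the target would be an MMSE, GDS, or Barthel score, and SHAP values would replace the permutation-importance ranking to attribute predictions to individual sources.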