Risk Stratification for In-Hospital Mortality in Alzheimer’s Disease Using Interpretable Regression and Explainable AI
Abstract
Background: Older adults with Alzheimer’s disease (AD) face heightened risk of adverse hospital outcomes, including mortality, yet early identification of high-risk patients remains a challenge. While regression models provide interpretable associations, they may miss nonlinear interactions that machine learning can uncover.

Objective: To identify key predictors of in-hospital mortality among AD patients using both survey-weighted logistic regression and explainable machine learning.

Methods: We analyzed hospitalizations among AD patients aged ≥60 years in the 2017 Nationwide Inpatient Sample (NIS). The outcome was in-hospital death. Predictors included demographics, hospital variables, and 15 comorbidities. Logistic regression used survey weighting to support nationally representative inference; XGBoost incorporated NIS discharge weights as sample weights during 5-fold hospital-grouped cross-validation, and the same weights were used in performance evaluation. Missing-value imputation and feature scaling were performed within the cross-validation pipelines to prevent data leakage. Model performance was assessed using AUROC, AUPRC, Brier score, and log loss. Feature importance was assessed using adjusted odds ratios (aORs) and SHapley Additive exPlanations (SHAP). A sensitivity analysis excluding palliative care and do-not-resuscitate (DNR) status was evaluated under the same grouped cross-validation.

Results: In the full model, logistic regression achieved AUROC 0.879 and AUPRC 0.310, while XGBoost achieved AUROC 0.887 and AUPRC 0.324. Palliative care (aOR 6.19), acute respiratory failure (aOR 5.15), sepsis (aOR 2.26), and DNR status (aOR 2.20) were the strongest logistic predictors. SHAP analysis corroborated these findings and additionally emphasized dysphagia, malnutrition, and pressure ulcers. In the sensitivity analysis excluding palliative care and DNR status, logistic regression performance declined (AUROC 0.806; AUPRC 0.206), and XGBoost declined to a comparable level (AUROC 0.811; AUPRC 0.206).
In the full model, SHAP corroborated the dominant signals from end-of-life documentation and acute organ failure; in the restricted model (excluding DNR and palliative care), SHAP highlighted physiologic and frailty-related features (e.g., dysphagia, malnutrition, aspiration risk) that may be more actionable when end-of-life documentation is absent.

Conclusion: Combining regression with explainable machine learning enables robust mortality risk stratification in hospitalized AD patients. Restricted models excluding end-of-life indicators provide actionable risk signals when such documentation is absent, while the full model may better support resource allocation and goals-of-care workflows.
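
The hospital-grouped cross-validation and discharge-weighted evaluation described in the Methods can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names (`grouped_folds`, `weighted_brier`, `weighted_log_loss`, `weighted_auroc`), the greedy fold-balancing rule, and the toy records are all assumptions made for illustration, and the study itself presumably relied on standard library routines. AUPRC is omitted for brevity.

```python
import math
from collections import defaultdict


def grouped_folds(hospital_ids, n_folds=5):
    """Assign records to folds so that each hospital's records share one fold.

    Hypothetical greedy rule: largest hospitals first, each placed in the
    currently smallest fold. Keeping a hospital in a single fold prevents
    it from appearing in both the training and validation splits.
    """
    counts = defaultdict(int)
    for h in hospital_ids:
        counts[h] += 1
    fold_sizes = [0] * n_folds
    fold_of = {}
    for h, n in sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        k = fold_sizes.index(min(fold_sizes))
        fold_of[h] = k
        fold_sizes[k] += n
    return [fold_of[h] for h in hospital_ids]


def weighted_brier(y, p, w):
    """Weighted mean squared error between predicted risk and outcome."""
    return sum(wi * (pi - yi) ** 2 for yi, pi, wi in zip(y, p, w)) / sum(w)


def weighted_log_loss(y, p, w, eps=1e-15):
    """Weighted mean negative log-likelihood, with predictions clipped."""
    total = 0.0
    for yi, pi, wi in zip(y, p, w):
        pi = min(max(pi, eps), 1 - eps)
        total -= wi * (yi * math.log(pi) + (1 - yi) * math.log(1 - pi))
    return total / sum(w)


def weighted_auroc(y, p, w):
    """Weighted chance that a death is scored above a survivor (ties = 0.5).

    Quadratic in the number of records; fine for a sketch, too slow for
    a dataset the size of the NIS.
    """
    pos = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 1]
    neg = [(pi, wi) for yi, pi, wi in zip(y, p, w) if yi == 0]
    num = 0.0
    for pp, wp in pos:
        for pn, wn in neg:
            if pp > pn:
                num += wp * wn
            elif pp == pn:
                num += 0.5 * wp * wn
    return num / (sum(wp for _, wp in pos) * sum(wn for _, wn in neg))


# Toy records (invented): hospital ID, in-hospital death, predicted risk,
# and a discharge weight standing in for the NIS survey weight.
hospital = ["A", "A", "B", "B", "C", "C", "D", "E"]
died = [0, 1, 0, 0, 1, 0, 0, 1]
pred = [0.10, 0.80, 0.20, 0.30, 0.60, 0.40, 0.05, 0.70]
weight = [5.0, 5.0, 4.8, 4.8, 5.2, 5.2, 5.0, 4.9]

folds = grouped_folds(hospital, n_folds=3)
print("fold per record:", folds)
print("weighted Brier:", weighted_brier(died, pred, weight))
print("weighted log loss:", weighted_log_loss(died, pred, weight))
print("weighted AUROC:", weighted_auroc(died, pred, weight))
```

In a leakage-safe pipeline, imputation and scaling parameters would be estimated on the training folds only and then applied to the held-out fold, which is the reason the Methods place those steps inside the cross-validation loop.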