Machine Learning-Based Prediction of Depression Risk in Patients with Multimorbidity: A Study Using NHANES Data from 2009-2016

Kangli Ye
Han Peng
Xuan Zhang
Lefeng Wang
Jian Yang
Mengjie Hu

Abstract

OBJECTIVE Patients with multimorbidity are at significantly higher risk of developing depression than those without multimorbidity, and depression can severely impair psychosocial functioning and reduce quality of life. However, no effective method currently exists to assess depression risk in these patients. In this study, we used data from the National Health and Nutrition Examination Survey (NHANES) and applied machine learning (ML) techniques to build a model that predicts the likelihood of depression in patients with multimorbidity.

METHODS Data from the 2009–2016 NHANES cycles were used, incorporating a wide range of demographic and clinical variables. Several machine learning algorithms were evaluated for predictive performance, and the CatBoost classifier, which performed best, was selected to build the prediction model. The model's predictions were interpreted using SHapley Additive exPlanations (SHAP) values, which ranked feature importance and visualized the contribution of variables such as socioeconomic status, biological factors, and lifestyle habits to predicting depression among multimorbid patients.

RESULTS The model's predictive performance was robust both for overall depression and for the binary classification of moderate to severe depression. For moderate to severe depression, the model achieved an accuracy of 0.9541, an AUC of 0.9903, and an F1 score of 0.954. Precision and recall were 0.9475 and 0.9607, respectively, with a kappa value of 0.9081 and an MCC of 0.9083. SHAP plots showed that age and socioeconomic status were among the most important predictors for both the depression and the moderate to severe depression classifications.

CONCLUSION This study developed a machine learning model to predict depression risk in patients with multimorbidity using NHANES data. The model demonstrated excellent predictive performance, and SHAP plots highlighted key predictors such as age and socioeconomic status, which strongly influenced the prediction of depression and its severity.
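
The metrics quoted in the results (accuracy, precision, recall, F1, Cohen's kappa, MCC) can all be derived from a single binary confusion matrix. The sketch below shows those standard formulas in plain Python. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's data or code. The last lines check that the reported F1 of 0.954 is the harmonic mean of 0.9475 and 0.9607, under the assumption that 0.9475 is the model's precision.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_pos + p_neg
    kappa = (accuracy - p_e) / (1 - p_e)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, NOT taken from the study.
example = binary_metrics(tp=90, fp=10, fn=5, tn=95)

# Consistency check on the reported figures (assumes 0.9475 is precision).
reported_f1 = round(f1_from(0.9475, 0.9607), 3)
```

On roughly balanced classes, kappa and MCC tend to land close to each other, as they do in the reported results (0.9081 and 0.9083).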

Version published to 10.21203/rs.3.rs-5430116/v1 on Research Square
Dec 17, 2024

Related articles

Explainable Machine Learning Models for Predicting Health-Related Quality of Life in High-Risk Cardiovascular Populations: A Comparative Analysis of SF-12 Data and Clinical Risk Stratification

This article has 7 authors:
1. Guoliang Ma
2. Xin Hong
3. Lin Zhu
4. Wenting Li
5. Zhuanzhuan Fan
6. Kun Li
7. Wenyan Wang
This article has no evaluations. Latest version: Sep 30, 2025

Development and validation of a Cognitive Impairment Risk Prediction Model for Elderly Patients with Multimorbidity

This article has 10 authors:
1. Ruxu Ge
2. Xiaoqing Zhao
3. Ya Zhang
4. Yuxin Jiang
5. Tongtong Guo
6. Zhiwei Dong
7. Pengru Sun
8. Haiyan Li
9. Wengui Zheng
10. Qi Jing
This article has no evaluations. Latest version: Oct 8, 2025

A Machine Learning Approach to Prediction and Multimorbidity Risk Factor Identification in a low- and middle-income country

This article has 7 authors:
1. Olalekan A. Uthman
2. Matthew Hazell
3. Muhammed Mubashir Babatunde Uthman
4. Kolawole W Wahab
5. Ponnusamy Saravanan
6. Paramjit Gill
7. Andre Pascal Kengne
This article has no evaluations. Latest version: Oct 15, 2025
