Explainable machine learning model incorporating urinary heavy metals to predict nonalcoholic fatty liver disease

Xiaoqian Wang
Mei Xue
Hannah Chang
Bochun Wang
Wenquan Niu
Chung-Chou H. Chang
Xiaoqun Dong

Abstract

Objectives: This study aimed to develop and validate an explainable machine learning (ML) model to predict nonalcoholic fatty liver disease (NAFLD) from urinary heavy metals and phenotypic indices.

Methods: Data were drawn from the National Health and Nutrition Examination Survey (NHANES) 2017-2020. NAFLD was defined as a controlled attenuation parameter (CAP) ≥ 274 dB/m. Urinary heavy metals were quantified by inductively coupled plasma mass spectrometry and normalized to urinary creatinine to account for urine dilution. Four ML algorithms (LightGBM, NNET, SVM, and XGBoost) were implemented. The dataset was split into training (60%) and validation (40%) sets.

Results: Among 1,213 adults, 512 were classified as having NAFLD and 701 as controls. XGBoost outperformed the other three algorithms (AUC = 0.7983; Brier score = 0.1804). Feature importance was assessed using SHapley Additive exPlanations (SHAP), which identified a minimal subset of 10 features that preserved model performance. The strongest predictors were body roundness index, triglyceride, diabetes mellitus, sex, age, and urinary concentrations of cadmium, cesium, barium, lead, and tungsten. Both global and local SHAP interpretations validated these features' contributions. The optimized XGBoost model was deployed as a web application (https://wxqdepression.shinyapps.io/nafldapp/).

Conclusions: XGBoost performed best in predicting NAFLD from a streamlined set of urinary heavy metals and phenotypic indicators. SHAP-based interpretability confirmed the relevance of this minimal feature set.
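The data-preparation and evaluation steps named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' pipeline. The function names, the random seed, and the input units (metal in µg/L, creatinine in mg/dL) are assumptions made here for the example. Only the CAP cutoff of 274 dB/m, the 60/40 split, and the two metrics (AUC and Brier score) come from the abstract. Model fitting and SHAP analysis are deliberately left out.

```python
import random

CAP_THRESHOLD_DB_M = 274  # NAFLD cutoff stated in the abstract


def creatinine_adjust(metal_ug_per_l, creatinine_mg_per_dl):
    """Return metal per gram of creatinine (µg/g).

    Assumed units: metal in µg/L, creatinine in mg/dL (1 mg/dL = 0.01 g/L).
    """
    creatinine_g_per_l = creatinine_mg_per_dl / 100.0
    return metal_ug_per_l / creatinine_g_per_l


def label_nafld(cap_db_m):
    """1 if CAP meets the cutoff, else 0."""
    return 1 if cap_db_m >= CAP_THRESHOLD_DB_M else 0


def split_60_40(rows, seed=0):
    """Shuffle a copy of rows and split it 60% training / 40% validation.

    The seed is a hypothetical choice for reproducibility.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.6 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random case is scored above a random control.

    Ties count as one half. Uses all case/control pairs, so it is
    quadratic in sample size; fine for a sketch, slow for large data.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A lower Brier score indicates better-calibrated probabilities, while a higher AUC indicates better discrimination between cases and controls, which is why the abstract reports both for the selected model.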

Version published to 10.21203/rs.3.rs-7286245/v1 on Research Square
Aug 18, 2025

Related articles

Prediction of Cognitive Impairment in Elderly Hypertensive Patients in the United States Using Machine Learning Algorithms: A Cross-Sectional Study

This article has 5 authors:
1. Zejing Lin
2. Rulan Ma
3. Xing Chen
4. Yinzhou Wang
5. Xingyong Chen
This article has no evaluations. Latest version: Aug 1, 2025

Construction and Validation of an Interpretable Machine Learning Model for Predicting Diabetes Risk in COPD Patients

This article has 13 authors:
1. Lingpin Pang
2. Siyan Xu
3. Yingxin Wang
4. Tao Huang
5. Qian Xian
6. Wenjia Lin
7. Haowen Pang
8. Zhirui Chen
9. Bozhi Zhong
10. Hui Miao
11. Hui Chen
12. Xishi Sun
13. Jie Sun
This article has no evaluations. Latest version: Aug 19, 2025

Metabolomics-Guided Machine Learning Reveals Diagnostic and Mechanistic Biomarkers in CHB with MASLD

This article has 10 authors:
1. Chuyang Wang
2. Yutao Chen
3. Huanming Xiao
4. Jianxiong Cai
5. Ruihua Wang
6. Xuan Zeng
7. Ming Lin
8. Wofeng Liu
9. Xiaoling Chi
10. Qubo Chen
This article has no evaluations. Latest version: Aug 20, 2025
