Development and Validation of an Interpretable Machine Learning Model to Identify Coexisting Type 2 Diabetes Mellitus in Patients with Metabolic Dysfunction-Associated Fatty Liver Disease

Hui Zhu
Jia Zhang
Xi Xu
Yi Lv
Chenxia Lu
Qi Hao
Jingjing Huang
Miao Peng
Jingzhi Wang
Ouyang Kani
Zixin Shu
Shujie Song
Xiaodong Li
Mingzhong Xiao

Abstract

Patients with metabolic dysfunction-associated fatty liver disease (MAFLD) have a significantly higher risk of type 2 diabetes mellitus (T2DM), but hepatocentric risk prediction remains underexplored. This study aims to develop and validate an interpretable machine learning model for identifying concomitant T2DM in patients with MAFLD. A prospective cohort of 4,472 MAFLD patients (2022–2025) was analyzed, with random allocation to training (n=3,129) and validation (n=1,343) sets. Four machine learning models were compared, with the Boruta and LASSO algorithms used for feature selection. Model performance was evaluated using ROC-AUC, PR-AUC, and calibration plots, and SHAP analysis was used for interpretability. XGBoost performed best, with a validation ROC-AUC of 0.799 (95% CI: 0.763–0.835). The final model incorporated eight variables: age, triglycerides, controlled attenuation parameter, liver stiffness measurement, ALT, AST, hsCRP, and eGFR. SHAP analysis identified age, triglycerides, and liver stiffness measurement as the predominant predictors. Risk stratification partitioned patients into low-, intermediate-, and high-risk tiers with progressively higher T2DM prevalence (7.4%, 28.1%, and 42.1%, respectively). This XGBoost-based framework provides a clinically viable tool for early T2DM identification in MAFLD patients, facilitating tailored metabolic intervention.

Trial registration: Chinese Clinical Trial Registry (ChiCTR), ChiCTR2200063127, registered on August 31, 2022.
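The random allocation and the three-tier risk stratification described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the random seed, and the probability cutoffs (0.2 and 0.5) are hypothetical, since the abstract does not report how the tiers were defined. Only the cohort sizes (4,472 patients, 3,129 for training, 1,343 for validation) come from the abstract.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Randomly allocate patient IDs to a training and a validation set."""
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def assign_tier(prob, low_cut=0.2, high_cut=0.5):
    """Map a predicted T2DM probability to a risk tier.

    The cutoffs are hypothetical; the abstract does not state them.
    """
    if prob < low_cut:
        return "low"
    if prob < high_cut:
        return "intermediate"
    return "high"


def prevalence_by_tier(probs, labels, low_cut=0.2, high_cut=0.5):
    """Observed T2DM prevalence (as a fraction) within each risk tier."""
    counts = {"low": [0, 0], "intermediate": [0, 0], "high": [0, 0]}
    for prob, label in zip(probs, labels):
        tier = assign_tier(prob, low_cut, high_cut)
        counts[tier][0] += label  # number of patients with T2DM in the tier
        counts[tier][1] += 1      # total number of patients in the tier
    return {tier: (pos / n if n else None) for tier, (pos, n) in counts.items()}


# Cohort sizes taken from the abstract; the IDs themselves are placeholders.
train_ids, valid_ids = split_cohort(range(4472), n_train=3129)
```

In practice, `probs` would be the model's predicted probabilities on the validation set and `labels` the observed T2DM status (0 or 1); a well-separated model shows prevalence rising from the low to the high tier, as in the figures the abstract reports.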

Version published to 10.21203/rs.3.rs-9265036/v1 on Research Square
Apr 17, 2026
