Machine Learning–Based Prediction of Ultrasound-Detected Metabolic Dysfunction–Associated Steatotic Liver Disease Using Routine Clinical and Biochemical Parameters

Canan Akkus
Gamze Sonmez
Ali Şahin
Melis Gokgoz
Feride Caglar
Sanem Kayhan

Abstract

Background/Objectives: Metabolic dysfunction–associated steatotic liver disease (MASLD) is now the leading cause of chronic liver disease globally, mirroring the increasing prevalence of obesity, insulin resistance, and type 2 diabetes. Early detection of hepatic steatosis is vital for cardiometabolic risk assessment; however, conventional imaging is costly and impractical for population screening. This study aimed to develop interpretable machine-learning models to predict ultrasound-detected MASLD using routinely available clinical and biochemical data.

Methods: We analyzed data from 644 adults (50% with MASLD on ultrasonography). Preprocessing, imputation, and feature selection were implemented within a single scikit-learn pipeline to avoid information leakage. An Elastic Net–regularized logistic regression identified the top 20 predictors, which were subsequently used across nine supervised machine learning (ML) classifiers. Model performance was evaluated via repeated stratified 5-fold cross-validation (25 resamples) using accuracy, F1 score, sensitivity, specificity, Youden's J, balanced accuracy, and area under the receiver operating characteristic curve (AUROC). Interpretability was assessed using SHapley Additive exPlanations (SHAP).

Results: Participants with MASLD exhibited greater adiposity, insulin resistance, and dyslipidemia than controls [p < 0.05 for body mass index (BMI), waist circumference, glucose, HbA1c, and triglycerides]. Elastic Net selection highlighted weight, ponderal index, Fibrosis-4 Index (FIB-4), blood urea nitrogen (BUN)/creatinine ratio, aspartate aminotransferase to platelet ratio index (APRI), and visceral adiposity index as the strongest predictors. Logistic Regression and Gradient Boosting achieved the best performance (accuracy = 0.65 ± 0.03; AUROC = 0.71 ± 0.04; balanced accuracy = 0.66 ± 0.06), exceeding the performance reported in the literature for rule-based indices such as the Fatty Liver Index (FLI) and Hepatic Steatosis Index (HSI). SHAP analysis confirmed clinically coherent feature effects, with higher anthropometric and hepatic injury indices increasing predicted MASLD probability.

Conclusions: Routinely available clinical and biochemical parameters can predict hepatic steatosis with moderate accuracy using transparent, interpretable ML models. Logistic Regression and Gradient Boosting provided the best discrimination and generalizability, offering a pragmatic, low-cost approach for early MASLD screening in primary and metabolic care settings.
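
The evaluation design described in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic (generated to match only the sample size and class balance), and the median imputation, standardization, Elastic Net mixing ratio (l1_ratio = 0.5), and final classifier settings are all assumptions chosen for illustration. What it does show is how placing imputation, scaling, and top-20 feature selection inside one scikit-learn pipeline means each of the 25 resamples refits those steps on its own training folds only, which is the mechanism that prevents information leakage.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption): 644 rows, balanced classes,
# 40 candidate predictors, about 5% of values missing at random.
X, y = make_classification(n_samples=644, n_features=40, n_informative=10,
                           weights=[0.5, 0.5], random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Elastic Net logistic regression used only to rank features; hyperparameters
# here are illustrative, not taken from the study.
ranker = LogisticRegression(penalty="elasticnet", solver="saga",
                            l1_ratio=0.5, max_iter=5000)

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    # threshold=-inf makes max_features the only criterion: keep the top 20.
    ("select", SelectFromModel(ranker, threshold=-np.inf, max_features=20)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# 5 folds x 5 repeats = 25 resamples, each stratified by outcome.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=0)
scoring = {
    "accuracy": "accuracy",
    "f1": "f1",
    "auroc": "roc_auc",
    "balanced_accuracy": "balanced_accuracy",
    "sensitivity": "recall",                                # recall of class 1
    "specificity": make_scorer(recall_score, pos_label=0),  # recall of class 0
}
scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring)

# Youden's J = sensitivity + specificity - 1, computed per resample.
youden_j = scores["test_sensitivity"] + scores["test_specificity"] - 1.0

for name in scoring:
    vals = scores[f"test_{name}"]
    print(f"{name}: {vals.mean():.2f} ± {vals.std():.2f}")
print(f"youden_j: {youden_j.mean():.2f} ± {youden_j.std():.2f}")
```

Swapping the final "clf" step for another estimator (for example a gradient boosting classifier) would reuse the same preprocessing and resampling scheme, which is one way to compare several classifiers on equal footing. Numbers printed from synthetic data say nothing about the study's reported results.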

Version published to 10.20944/preprints202511.0648.v1
Nov 10, 2025
