A Basic Trustworthy Machine Learning Framework for Early Diabetes Detection

Kazi Sakib Hasan

Abstract

This study presents a trustworthy machine learning framework for early diabetes detection that addresses gaps in reliability, interpretability, and fairness in clinical AI systems. It combines causal inference, modern ensemble methods (LightGBM, XGBoost-DART, HistGBM), and TabNet for tabular deep learning to improve predictive performance while keeping the models transparent. A novel Causal-guided Stacking Classifier (CGSC) is introduced; it uses LightGBM as a meta-learner trained on causally relevant features identified through Causal Forests. Interpretability is provided by SHAP-based global and local explanations and by TabNet’s intrinsic attention mechanism, which gives feature-level insights. Counterfactual reasoning (DiCE) supports personalized risk mitigation by identifying the minimal feature changes needed to alter a prediction. To promote fairness, gender is excluded as a direct feature, with the aim of reducing demographic bias. In the experiments, CGSC achieves the highest recall (0.81), the metric most critical for early warning systems, while TabNet attains the highest precision (0.79). Uncertainty quantification shows stable F1-scores (0.73 ± 0.03) across the ensemble models. Key causal drivers include general health (average treatment effect, ATE = 0.1392) and cardiovascular factors, while counterintuitive findings such as the negative association for alcohol consumption (ATE = -0.1875) warrant further investigation. The framework’s emphasis on causal feature selection, model transparency, and actionable explanations aligns with healthcare requirements for trustworthy AI and offers a reproducible approach to diabetes risk stratification with potential clinical applicability. All experiments are fully reproducible, with resources available in the accompanying GitHub repository.
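The quantities quoted in the abstract (precision, recall, F1 with a spread, and an ATE) can be illustrated with a minimal, standard-library-only sketch. This is not the author's pipeline: the function names, the toy labels, and the bootstrap procedure for the F1 spread are assumptions made for illustration, and the difference-in-means estimator is only a naive stand-in for the Causal Forest estimates reported in the abstract (it is a valid ATE estimate only when treatment assignment is unconfounded).

```python
import random
import statistics


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def bootstrap_f1(y_true, y_pred, n_resamples=200, seed=0):
    """Mean and standard deviation of F1 over bootstrap resamples.

    Assumption: the abstract does not say how its "0.73 ± 0.03" was
    computed; resampling with replacement is one common choice.
    """
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        _, _, f1 = precision_recall_f1(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
        scores.append(f1)
    return statistics.mean(scores), statistics.stdev(scores)


def naive_ate(treated, outcome):
    """Difference in mean outcome between treated (1) and control (0) units.

    Assumption: a naive estimator, NOT the Causal Forest method
    named in the abstract.
    """
    y1 = [y for t, y in zip(treated, outcome) if t == 1]
    y0 = [y for t, y in zip(treated, outcome) if t == 0]
    return statistics.mean(y1) - statistics.mean(y0)


# Toy labels (hypothetical, not from the study): 4 TP, 1 FP, 1 FN.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
f1_mean, f1_std = bootstrap_f1(y_true, y_pred)
```

On the toy labels, precision and recall are both 4/5. Recall is the fraction of true positive cases that the model flags, which is why the abstract singles it out for an early warning setting where missed cases are costly.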

Version published to 10.20944/preprints202505.0292.v2
Jun 13, 2025
Version published to 10.20944/preprints202505.0292.v1
May 7, 2025

Related articles

Actionable and Interpretable ML-Based Early Warning Systems for Divorce Incorporating Causal Inference and Counterfactuals
Kazi Sakib Hasan, Most. Afia Anjum Borsha
Latest version May 12, 2025

Advancing Cardiovascular Disease Diagnosis: A Robust ML Ecosystem Integrating Early Detection, Responsible AI Framework, and Causal Inference
Kazi Sakib Hasan, Irfan Sadi Dhrubo
Latest version Jun 18, 2025
