Explainability in Action: A Metric-Driven Assessment of Five XAI Methods for Healthcare Tabular Models

Abstract

As explainable AI (XAI) becomes increasingly important in healthcare machine learning (ML) applications, there is a growing need for reproducible frameworks that quantitatively assess the quality of explanations. In this study, we conduct a comparative evaluation of five widely used XAI methods (LIME, SHAP, Anchors, EBM, and TabNet) on multiple healthcare tabular datasets using six well-established metrics: fidelity, simplicity, consistency, robustness, precision, and coverage. While the metrics are derived from the existing literature, we formalize them mathematically and implement them, providing open-source code to support standardized benchmarking. Empirically, our experiments confirm that SHAP (with TreeSHAP) achieves perfect fidelity in approximating the probability outputs of tree-based models, consistent with its theoretical design. LIME offers simpler explanations but sacrifices fidelity. EBM and TabNet demonstrate strong robustness to input perturbations, while Anchors produces precise rule-based explanations with limited data coverage. These results offer practical guidance for selecting XAI methods based on application priorities such as fidelity, robustness, or simplicity. Our open-source framework enables reproducible, quantitative evaluation of XAI techniques in clinical ML workflows. Although evaluated in a clinical context, the proposed framework and metrics are broadly applicable to other domains involving tabular data. The source code is available at https://github.com/matifq/XAI_Tab_Health.
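
To illustrate the TreeSHAP fidelity result concretely, the sketch below checks that a tree ensemble's SHAP base value plus its per-feature attributions exactly reconstructs the model's predicted probability. This is an illustrative reading of the fidelity metric, not the authors' implementation (see the linked repository for that); the dataset, model, and tolerance check are assumptions chosen for a self-contained example.

```python
# Hypothetical fidelity check: do TreeSHAP attributions exactly
# reconstruct a tree ensemble's predicted probabilities?
import numpy as np
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# A healthcare-style tabular dataset, standing in for those in the paper.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

explainer = shap.TreeExplainer(model)  # exact TreeSHAP for tree ensembles
sv = explainer.shap_values(X_te)

# SHAP's return shape differs across library versions:
# a list of per-class arrays (older) vs. one (n_samples, n_features,
# n_classes) array (newer). Handle both, keeping the positive class.
if isinstance(sv, list):
    sv_pos, base = sv[1], explainer.expected_value[1]
else:
    sv_pos, base = sv[:, :, 1], explainer.expected_value[1]

# Fidelity: base value + summed attributions should match the model's
# positive-class probability. For random forests the trees output
# probabilities directly, so the identity holds exactly.
reconstructed = base + sv_pos.sum(axis=1)
print(np.allclose(reconstructed, model.predict_proba(X_te)[:, 1]))
# Expected output: True
```

Perfect fidelity here follows from TreeSHAP's exact computation over the tree structure; model-agnostic approximators such as LIME, which fit a local surrogate to sampled perturbations, cannot guarantee this identity.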
