A Reasoning Pathway Explanation Framework for Clinical AI: Methods and Evaluation


Abstract

Objective: When AI predicts acute myocardial infarction, existing explanation methods identify which features mattered (e.g., elevated troponin) but not how risk factors lead to the diagnosis through biological mechanisms. We developed a framework that generates reasoning pathways, i.e., clinically grounded chains linking risk factors, pathophysiology, and evidence, to address this gap. Methods: Using a 34-node cardiology reasoning graph, MIMIC-III data, and medical ontologies (SNOMED CT, UMLS), we built an explanation engine that maps AI predictions to temporally ordered, evidence-linked pathways. We evaluated 100 AMI cases (87 expanded, 13 independent) with six structural metrics, adversarial validation, and comparison with BioBERT and SHAP. Three physicians independently rated 11 cases across five clinical quality dimensions. Results: The framework generated consistent reasoning pathways across all cases (pathway-prediction consistency 0.85 ± 0.01). Adversarial validation confirmed discriminative power for this metric (AUC-ROC 0.81) but not for the others. In the physician pilot, inter-rater agreement was strong (ICC = 0.83), and all three evaluators detected the deliberately flawed control case (2.13/5 vs. 3.92/5 for genuine cases). Evidence sufficiency for complex independent cases was the primary concern identified. Conclusion: Structured reasoning pathways that trace clinical logic from risk factors to diagnosis can be generated and evaluated systematically. Physicians agreed on pathway quality (ICC = 0.83) and consistently rejected flawed explanations, though evidence depth for complex cases requires improvement.
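To make the core idea concrete, the sketch below shows one plausible way to represent a reasoning pathway as an ordered chain of ontology-linked nodes and to score a simplified pathway-prediction consistency. This is an illustrative toy, not the paper's implementation: the node fields, the feature sets, and the metric definition (fraction of pathway nodes supported by at least one of the model's top predictive features) are assumptions for demonstration only.

```python
from dataclasses import dataclass, field

@dataclass
class PathwayNode:
    concept: str   # clinical concept label (e.g., a SNOMED CT / UMLS term)
    kind: str      # "risk_factor", "mechanism", "evidence", or "diagnosis"
    features: set = field(default_factory=set)  # model features linked to this node

def pathway_prediction_consistency(pathway, top_features):
    """Toy metric: fraction of pathway nodes supported by at least one
    of the model's top predictive features. A simplification of the
    pathway-prediction consistency described in the abstract."""
    if not pathway:
        return 0.0
    supported = sum(1 for node in pathway if node.features & top_features)
    return supported / len(pathway)

# A hypothetical AMI pathway: risk factor -> mechanism -> evidence -> diagnosis
pathway = [
    PathwayNode("Hypertension", "risk_factor", {"systolic_bp"}),
    PathwayNode("Atherosclerotic plaque rupture", "mechanism"),
    PathwayNode("Elevated troponin", "evidence", {"troponin_t"}),
    PathwayNode("Acute myocardial infarction", "diagnosis",
                {"troponin_t", "st_elevation"}),
]

score = pathway_prediction_consistency(pathway, {"troponin_t", "systolic_bp"})
print(score)  # 0.75 — the mechanism node has no direct feature support
```

In a real system the feature links would come from the reasoning graph and ontology mappings rather than hand-coded sets, and unsupported mechanism nodes (like the plaque-rupture step here) are exactly where the abstract's concern about evidence sufficiency would surface.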
