Quantifying Explainability in Healthcare AI with the Extended Collaborative Intelligence Index (X-CII): A Synthetic Evaluation Framework

Abstract

Human–AI collaboration in healthcare motivates explainable AI (XAI) to promote trust, safety, and regulatory alignment for high-risk systems under the EU AI Act [1] and IMDRF GMLP guidance [2]. We propose the Extended Collaborative Intelligence Index (X-CII), which integrates team quality (Q), effectiveness (E), and safety (S) through a risk-sensitive power mean (λ = 0.25). To link explainability directly to risk mitigation and address critiques of post-hoc XAI [3], our synthetic evaluation applies a conservative +5% multiplicative uplift to team detectability (d'), reflecting reported 5–10% task-performance gains with XAI [16,17]. Under the equal-variance binormal model, this raises AUC from 0.800 to approximately 0.813. The uplift modifies only S, keeping Q and E fixed. Unless otherwise stated, relative percentages are referenced to the better individual agent (human or AI). Using 10,000 paired Monte Carlo draws with independent skills (human–AI skill correlation ρ = 0), the XAI-enhanced team achieved a median relative X-CII of 102.96% (IQR 101.24–104.56%), outperforming the better individual in 89.7% of cases. Versus an identical team without XAI, median X-CII rose by 0.811% (IQR 0.593–1.003%) with a 100% win rate, isolating explainability's incremental contribution. Under domain shift (AUC = 0.72 with adjusted fidelity/reliance parameters), the median remained 102.82%. Lower integration efficiency (η ≤ 0.8, where η = 1 denotes ideal integration) reduced team performance below baseline, whereas negative skill correlation (ρ = -0.5), indicating complementary strengths, increased gains (median 108.66%). Safety normalization (S = 1 - L / L_worst) ensures bounded, comparable scores, though it compresses differences at high performance. The X-CII framework can help quantify how explainability contributes to safe and effective human–AI teamwork and benchmark compliance-oriented design. This work provides no legal advice; consult the official EU AI Act and competent authorities for regulatory interpretation.
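The two quantitative relations named above can be made concrete. The sketch below assumes the standard equal-variance binormal relation AUC = Φ(d'/√2) and, for illustration only, an unweighted power mean over Q, E, and S; the exact weighting and normalization of X-CII are defined in the full paper and may differ.

    from statistics import NormalDist

    _PHI = NormalDist()  # standard normal distribution

    def dprime_from_auc(auc: float) -> float:
        # Equal-variance binormal model: d' = sqrt(2) * Phi^{-1}(AUC).
        return 2 ** 0.5 * _PHI.inv_cdf(auc)

    def auc_from_dprime(d: float) -> float:
        # Inverse relation: AUC = Phi(d' / sqrt(2)).
        return _PHI.cdf(d / 2 ** 0.5)

    def safety_score(loss: float, worst_loss: float) -> float:
        # Safety normalization from the abstract: S = 1 - L / L_worst,
        # bounded in [0, 1] when 0 <= loss <= worst_loss.
        return 1.0 - loss / worst_loss

    def x_cii(q: float, e: float, s: float, lam: float = 0.25) -> float:
        # Risk-sensitive power mean with exponent lambda = 0.25
        # (illustrative, unweighted). Because lam < 1, a weak component
        # (e.g., low safety) drags the index down more than a strong
        # component lifts it.
        return ((q ** lam + e ** lam + s ** lam) / 3.0) ** (1.0 / lam)

    # Reproduce the abstract's detectability uplift: start from AUC = 0.800,
    # apply the +5% multiplicative uplift to d', and map back to AUC.
    d0 = dprime_from_auc(0.800)                   # ~1.19
    print(round(auc_from_dprime(1.05 * d0), 3))   # ~0.81, near the reported ~0.813

The choice of λ = 0.25 makes the aggregation conservative: unlike an arithmetic mean, the power mean with exponent below 1 cannot mask a poor safety score with high quality or effectiveness.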
