Bridging the Gap: Explainable AI for Autism Diagnosis and Parental Support with TabPFNMix and SHAP
Abstract
Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition that affects a growing number of individuals worldwide. Despite extensive research, its underlying causes remain largely unknown, with genetic predisposition, parental history, and environmental influences identified as potential risk factors. Diagnosis remains challenging because ASD presents in highly variable ways and overlaps with other neurodevelopmental disorders. Early and accurate diagnosis is crucial for timely intervention, which can significantly improve developmental outcomes and support for parents. This work presents a novel framework based on artificial intelligence (AI) and explainable AI (XAI) to enhance ASD diagnosis and provide interpretable insights for medical professionals and caregivers. The proposed framework leverages an advanced classification model, the TabPFNMix regressor, which is optimized for structured medical datasets. Compared with traditional machine learning methods, TabPFNMix captures complex ASD-related patterns more effectively. To address the black-box nature of AI models, SHapley Additive exPlanations (SHAP) is integrated to provide transparent, interpretable reasoning behind the model’s decisions for clinicians and caregivers. Extensive experiments were conducted on a publicly available benchmark dataset, with performance evaluated using standard metrics: accuracy, precision, recall, F1-score, and AUC-ROC. Comparative analysis against baseline models, including Random Forest, XGBoost, Support Vector Machine (SVM), and Deep Neural Networks (DNNs), shows that TabPFNMix achieves the highest accuracy (91.5%), surpassing XGBoost (87.3%) by 4.2 percentage points. It also attains the best recall (92.7%), precision (90.2%), F1-score (91.4%), and AUC-ROC (94.3%), indicating both high diagnostic accuracy and robustness for ASD screening.
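As a quick consistency check on the figures quoted above, the following minimal Python sketch recomputes the F1-score from the reported precision and recall and the accuracy gap to XGBoost. The variable and function names are illustrative assumptions. The only inputs are the percentages stated in the abstract, and nothing here reproduces the authors' evaluation code.

```python
# Figures as reported in the abstract (percent); names are illustrative.
precision = 90.2
recall = 92.7
accuracy_tabpfnmix = 91.5
accuracy_xgboost = 87.3


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(precision, recall)
gap = accuracy_tabpfnmix - accuracy_xgboost

print(f"F1 from reported precision/recall: {f1:.1f}%")
print(f"Accuracy gap vs. XGBoost: {gap:.1f} percentage points")
```

Rounded to one decimal place, both values agree with the reported F1-score (91.4%) and the stated 4.2 percentage-point gap.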
An ablation study highlights the importance of feature selection and preprocessing: omitting key features or preprocessing steps (such as normalization and missing-data imputation) significantly degrades performance. Furthermore, SHAP-based feature importance analysis identifies social responsiveness scores, repetitive behavior scales, and parental age at birth as the most influential factors in ASD diagnosis. These findings align with the medical literature, supporting the reliability of the model’s predictions and its applicability in clinical settings.
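To illustrate what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a three-feature toy model in plain Python. Everything in it is an assumption made for illustration: the model, its weights, the feature names, and the instance and baseline values are hypothetical and are not taken from the paper or its dataset. The SHAP library itself estimates these quantities with faster, model-specific algorithms. This brute-force version only shows the underlying definition.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names echoing the factors named in the abstract.
FEATURES = ["social_responsiveness", "repetitive_behavior", "parental_age"]


def toy_model(x):
    """Hypothetical risk score; the weights are made up for illustration."""
    srs, rbs, age = x
    return 0.5 * srs + 0.3 * rbs + 0.1 * age + 0.2 * srs * rbs


def shapley_values(model, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of every coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [
                    instance[j] if (j in subset or j == i) else baseline[j]
                    for j in range(n)
                ]
                without_i = [
                    instance[j] if j in subset else baseline[j]
                    for j in range(n)
                ]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


instance = [1.0, 1.0, 1.0]  # hypothetical, already-normalized inputs
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point
phi = shapley_values(toy_model, instance, baseline)

for name, value in zip(FEATURES, phi):
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the additivity property that lets a per-feature ranking be read as a decomposition of a single prediction.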