A Machine-Learning-Based Exploration of ADHD Symptoms Among Higher Education Students in India Achieving a Predictive Accuracy of Up to 76.67%
Abstract
Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental condition characterized by inattention, hyperactivity, and impulsivity that affects academic and personal outcomes across age groups. This study presents a cross-sectional exploration of ADHD-like symptoms among students (N = 360) enrolled in higher education institutions in India, with an emphasis on using supervised machine learning to predict potential ADHD classification. Data were collected through a comprehensive questionnaire capturing demographic indicators and symptom-relevant items grounded in clinically aligned constructs. We employed three machine learning models (Logistic Regression, XGBoost, and CatBoost) within a pipeline that included hyperparameter tuning, dimensionality reduction via Uniform Manifold Approximation and Projection (UMAP), cross-validation, and final evaluation on a hold-out test set. Logistic Regression and CatBoost each achieved a predictive accuracy of 76.67% on the hold-out set, surpassing XGBoost at 68.89%. We also performed confusion matrix and Receiver Operating Characteristic (ROC) curve analyses to characterize classification performance in more detail, alongside a feature examination for each model. These findings indicate that machine learning is viable for early detection of ADHD symptoms in an academic context and offer a foundation for targeted interventions, counseling, and resource allocation. By combining data-driven insights with clinically informed scales, our work underscores the potential for higher education stakeholders to integrate advanced analytics in identifying and supporting students at risk of ADHD-related challenges.
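To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how accuracy, a binary confusion matrix, and the area under the ROC curve can be computed from true labels and predicted scores. It is illustrative only and is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions, and the numbers it produces are unrelated to the reported results.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 8 cases with model scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

# Assumed decision threshold of 0.5 turns scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("confusion (tp, fp, fn, tn):", confusion_matrix(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, whereas the ROC AUC summarizes ranking quality across all thresholds, which is one reason the two are commonly reported together when comparing classifiers.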