Task-Optimized Machine Learning for High-Accuracy Alzheimer’s Diagnosis from Handwriting Data
Abstract
Training complex models on Alzheimer’s Disease (AD) datasets is challenging because extracting features from a wide range of patient tasks is costly. High-performance AD detection models that rely on a small number of tasks can reduce dataset acquisition costs and improve model interpretability. To address this, we propose a two-stage forward-backward feature selection approach that identifies the most relevant tasks and features for predicting AD with high accuracy. We evaluate a range of machine learning methods, including Extreme Gradient Boosting (XGBoost), Random Forest, K-Nearest Neighbors, Support Vector Machine, Multi-Layer Perceptron, and Logistic Regression, to determine the best classification model for feature selection and downstream prediction. Given the limited sample size, we assess model performance using Leave-One-Out Cross-Validation (LOOCV), in which each sample is held out once for testing while the model is trained on all the others. We compare our method with multiple state-of-the-art feature selection approaches. The results indicate that combining our proposed feature selection method with the XGBoost classifier, using only four tasks, produces a model that is both more interpretable and higher-performing than the other approaches. This suggests that focusing on these four tasks, rather than collecting extensive task data from patients, can yield a reliable predictor for AD diagnosis with 91.37% accuracy, 93.94% recall, 89.77% precision, and a 91.32% F1 score, surpassing other classification methods. This research advances the efficiency and reliability of AD diagnosis, with the potential to improve patient prognosis and benefit healthcare systems.
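The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one plausible reading: a greedy forward pass that adds features while the LOOCV score improves, followed by a backward pass that drops any feature whose removal does not lower the score. To stay dependency-free, a 1-nearest-neighbour classifier stands in for XGBoost. The function names `loocv_accuracy` and `forward_backward_select` and the toy data are all assumptions for illustration, not the authors' code or data. A candidate here is a single feature column; selecting whole tasks would mean treating each candidate as a group of columns.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
ScoreFn = Callable[[Matrix, Sequence[int], List[int]], float]


def loocv_accuracy(X: Matrix, y: Sequence[int], cols: List[int]) -> float:
    """LOOCV accuracy of a 1-nearest-neighbour classifier on columns `cols`.

    Stand-in scorer (assumption): the paper reports XGBoost as its best model.
    """
    if not cols:
        return 0.0
    n = len(X)
    correct = 0
    for i in range(n):  # hold out sample i, compare against all others
        best_j, best_d = -1, float("inf")
        for j in range(n):
            if j == i:
                continue
            d = sum((X[i][c] - X[j][c]) ** 2 for c in cols)
            if d < best_d:
                best_j, best_d = j, d
        correct += int(y[best_j] == y[i])
    return correct / n


def forward_backward_select(
    X: Matrix,
    y: Sequence[int],
    candidates: List[int],
    score: ScoreFn = loocv_accuracy,
) -> Tuple[List[int], float]:
    """Greedy forward pass, then greedy backward pass (hypothetical sketch)."""
    selected: List[int] = []
    best = 0.0

    # Forward stage: add the single best candidate while the score improves.
    improved = True
    while improved:
        improved = False
        remaining = [c for c in candidates if c not in selected]
        scored = [(score(X, y, selected + [c]), c) for c in remaining]
        if scored:
            s, c = max(scored, key=lambda t: t[0])
            if s > best:
                selected.append(c)
                best = s
                improved = True

    # Backward stage: drop a feature if the score does not get worse,
    # which favours smaller feature sets at equal performance.
    improved = True
    while improved and len(selected) > 1:
        improved = False
        for c in list(selected):
            trial = [k for k in selected if k != c]
            s = score(X, y, trial)
            if s >= best:
                selected, best, improved = trial, s, True
                break

    return selected, best


# Toy data (made up): column 0 separates the classes, columns 1-2 do not.
y = [0] * 10 + [1] * 10
X = [
    [10.0 * label + 0.01 * i, float((7 * i) % 5), float((3 * i) % 4)]
    for i, label in enumerate(y)
]

selected, best = forward_backward_select(X, y, candidates=[0, 1, 2])
print(selected, best)
```

On this toy data the forward pass stops after picking column 0, since no additional column can raise an already perfect LOOCV accuracy. Passing a different `score` callable (for example, one that wraps a gradient-boosted classifier) would leave the selection loop unchanged.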