Predicting Amyloid Positivity Through Proteomic and Machine Learning Approaches
Abstract
Introduction: Alzheimer's disease is a progressive neurodegenerative disorder for which early detection remains difficult. To address this challenge, we analyzed a large proteomics dataset from older adults, including individuals diagnosed through clinical assessment and imaging confirmation of brain amyloid deposition. We hypothesized that amyloid positivity could be detected using blood-based proteomic profiles combined with statistical and machine learning methods.

Methods: We applied descriptive and inferential statistical analyses, including group comparisons between amyloid-positive and amyloid-negative individuals, alongside supervised classification. Classification methods, including random forests, gradient boosting, and neural networks, were used to evaluate prediction of amyloid status. All data were normalized and handled in a privacy-compliant manner.

Results: Distinct proteomic signatures were associated with disease status. Significant protein expression differences were observed between the amyloid-positive (n=337) and amyloid-negative (n=651) groups. Classification models reached balanced performance, with an AUC of up to 0.80. Eight proteins (SERPINA1, C3, CRP, APOE4, CFH, VTN, C1QTNF5, and PON1) emerged as strong predictors in the best-performing classifiers and represent potential biomarker candidates.

Discussion and Conclusions: Combining statistical and machine learning methods enabled robust identification of patterns distinguishing amyloid profiles. This strategy supports biomarker discovery and the development of accessible blood-based diagnostics and therapeutic targets.
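The reported group sizes (337 amyloid positive, 651 amyloid negative) are imbalanced, which is presumably why performance is summarized with AUC and described as balanced rather than given as raw accuracy. The sketch below illustrates these metrics in plain Python. It is not the authors' pipeline: the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for illustration, and only the two group sizes come from the abstract.

```python
from itertools import product


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, predictions):
    """Mean of sensitivity and specificity, so each class weighs equally."""
    pairs = list(zip(labels, predictions))
    n_pos = sum(1 for y, _ in pairs if y == 1)
    n_neg = sum(1 for y, _ in pairs if y == 0)
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)


# Group sizes reported in the abstract.
N_POS, N_NEG = 337, 651

# A classifier that always predicts "amyloid negative" already reaches this
# raw accuracy, yet its balanced accuracy is only 0.5.
majority_baseline_accuracy = N_NEG / (N_POS + N_NEG)
cohort_labels = [1] * N_POS + [0] * N_NEG
always_negative = [0] * (N_POS + N_NEG)
baseline_balanced = balanced_accuracy(cohort_labels, always_negative)

# Hypothetical toy scores (NOT data from the study), used only to show
# how the two metrics are computed.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold
toy_balanced = balanced_accuracy(toy_labels, toy_predictions)

if __name__ == "__main__":
    print(f"majority-class accuracy: {majority_baseline_accuracy:.3f}")
    print(f"majority-class balanced accuracy: {baseline_balanced:.3f}")
    print(f"toy AUC: {toy_auc:.3f}")
    print(f"toy balanced accuracy: {toy_balanced:.3f}")
```

With these group sizes, always predicting the majority class gives a raw accuracy of about 0.66 but a balanced accuracy of exactly 0.5, so an AUC of 0.80 reflects discrimination well beyond what the class imbalance alone would produce.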