Multi-Stage Graph Attention Networks for Interpretable Alzheimer’s Disease Classification from Genome-Wide Association Data

Ankita Saxena
Christopher Gaiteri
Stephen V Faraone

Abstract

Background

Genome-wide association studies have identified numerous variants associated with neuropsychiatric disorders. Although some significant loci can carry substantial risk, as in Alzheimer’s Disease, the remaining genetic variance is distributed across many small-effect loci. Polygenic risk scores (PRS) aggregate this risk but do not capture epistatic interactions, and they offer limited biological interpretability and predictive accuracy. Computing gene-level risk scores and integrating known or statistically validated gene-gene associations has the potential to increase interpretability, accuracy, or both. Graph Neural Networks (GNNs) can pursue these goals by operating on graph-structured genetic data that model potential epistatic interactions.

Methods

We developed a three-stage Graph Attention Network (GAT) classifier using individual-level GWAS data from 7,358 participants across seven Alzheimer’s Disease Center cohorts. Nodes were defined as genes, with risk scores from AD and 11 genetically correlated phenotypes serving as node features. We evaluated two graph construction strategies: gene co-expression networks derived from hippocampal transcriptomic data, and curated pathway-based graphs. Additionally, a bilinear context module was incorporated to capture global gene-gene interactions beyond the graph topology. Stage 1 trained a GNN encoder on the graphs; Stage 2 injected PRS for non-coding SNPs after the encoder to better capture genetic risk via transfer learning; and Stage 3 applied adversarial training with gradient reversal for ancestry debiasing. GNN predictions were ensembled with whole-genome PRS using elastic net regression.
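To make the attention step concrete, the following is a minimal sketch of a single-head graph attention update over a gene graph, written in plain Python. It follows the standard GAT formulation (score each edge with a LeakyReLU of a learned linear function of the two projected node vectors, then softmax over each node's neighbourhood); it is not the authors' implementation. The gene names, feature values, weight matrix `W`, and attention vectors `a_src` and `a_dst` are all hypothetical placeholders, and the bilinear context module and the three training stages are not modelled here.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbors, W, a_src, a_dst):
    """Single-head graph attention update (standard GAT form, illustrative only).

    features:  dict gene -> feature vector (e.g. per-phenotype gene risk scores)
    neighbors: dict gene -> list of adjacent genes in the graph
    W:         projection matrix as a list of rows
    a_src, a_dst: the two halves of the attention vector a, so that
                  a^T [W h_i || W h_j] = a_src . (W h_i) + a_dst . (W h_j)
    """
    # Project every node's features with the shared matrix W.
    z = {g: [dot(row, h) for row in W] for g, h in features.items()}
    updated, attention = {}, {}
    for g in features:
        # Attend over the node itself plus its graph neighbours.
        nbrs = [g] + [n for n in neighbors.get(g, []) if n != g]
        scores = [leaky_relu(dot(a_src, z[g]) + dot(a_dst, z[n])) for n in nbrs]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention[g] = dict(zip(nbrs, alphas))
        # New representation: attention-weighted sum of projected neighbours.
        updated[g] = [sum(al * z[n][k] for al, n in zip(alphas, nbrs))
                      for k in range(len(W))]
    return updated, attention

# Hypothetical toy graph: three genes, two features each (made-up values).
features = {"GENE_A": [0.8, 0.1], "GENE_B": [0.2, 0.5], "GENE_C": [-0.3, 0.4]}
neighbors = {"GENE_A": ["GENE_B", "GENE_C"], "GENE_B": ["GENE_A"], "GENE_C": ["GENE_A"]}
W = [[1.0, 0.5], [-0.5, 1.0]]   # placeholder for learned weights
a_src = [0.3, -0.2]             # placeholder for learned attention vector
a_dst = [0.6, 0.4]

updated, attention = gat_layer(features, neighbors, W, a_src, a_dst)
for gene, weights in attention.items():
    print(gene, {n: round(w, 3) for n, w in weights.items()})
```

The per-edge attention weights returned alongside the updated features are the kind of quantity that post-hoc explainability analyses can inspect, although attention weights alone are not a complete attribution method.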

Results

The best-performing GNN model, a GAT with bilinear context operating on the pathway graph, achieved an AUROC of 0.78 (95% CI: 0.75–0.80). Ensemble models combining Stage 2 or 3 GNN logits with whole-genome PRS achieved an AUROC of 0.82 (95% CI: 0.79–0.84), outperforming PRS alone (0.80). GxI attribution and additional explainability analyses revealed stage-specific biological signals, some of which recapitulated known gene-phenotype associations and others of which may point to new areas of inquiry.
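For readers unfamiliar with how an AUROC and its confidence interval are typically obtained, here is a minimal sketch in plain Python. It computes AUROC as the probability that a randomly chosen case outscores a randomly chosen control, and estimates an interval with a percentile bootstrap. The abstract does not state how the reported intervals were computed, so the bootstrap is an assumption, and the labels and scores below are synthetic.

```python
import random

def auroc(labels, scores):
    """AUROC as P(case score > control score), counting ties as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Synthetic data: cases score higher on average than controls.
rng = random.Random(42)
labels = [i % 2 for i in range(100)]
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUROC = {point:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

The pairwise comparison is quadratic in sample size, which is fine for a toy example; a rank-based implementation would be preferable for a cohort of several thousand participants.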

Conclusion

A multi-stage GAT framework captures complementary, non-additive genetic signal that, when ensembled with PRS, improves the accuracy of AD classification. Post-hoc explainability analyses yield biologically interpretable gene networks, supporting the utility of graph-based deep learning for dissecting complex genetic architectures.

Version published to 10.64898/2026.04.06.716790 on bioRxiv
Apr 9, 2026

Related articles

Integrating network annotation from multiple correlated traits to improve polygenic risk scores based on GWAS summary statistics

This article has 4 authors:
1. Qiuying Sha
2. Lirong Zhu
3. Xuewei Cao
4. Shuanglin Zhang
This article has no evaluations. Latest version: Apr 13, 2026

A Multi-Context Regulome-Wide Association Atlas for Genetic Studies of Aging Brain Disorders

This article has 14 authors:
1. Chunming Liu
2. Anqi Wang
3. Hao Sun
4. Kaixuan Luo
5. Sheng Qian
6. Yining Li
7. Xin He
8. Philip De Jager
9. David Bennett
10. Minghui Wang
11. Carlos Cruchaga
12. The Alzheimer’s Disease Functional Genomics Consortium
13. Gao Wang
14. Fabio Morgante
This article has no evaluations. Latest version: May 19, 2026

SNPic: SNP Topic Modeling for Interpretable Clustering of Complex Phenotypes

This article has 5 authors:
1. Zhang Leyi
2. Christof Seiler
3. Doug Speed
4. Raphael Micheroli
5. Caroline Ospelt
This article has no evaluations. Latest version: Apr 24, 2026
