Deep-learning-derived glaucoma-related endophenotypes enable novel genome-wide genetic and functional discovery
Abstract
The genetic architecture of primary open-angle glaucoma (POAG), a leading cause of irreversible blindness, remains largely unexplained because previous genome-wide association studies (GWAS) relied on imprecise phenotypes from electronic health records. Here, we overcome this limitation with a disease-trained, task-transfer machine learning (ML) framework that learns glaucoma-related damage patterns from a large clinical repository of 8,323 glaucoma patients. We show that ML-derived optical coherence tomography (OCT) endophenotypes, from models trained on 18,985 OCT scans from these patients, identify novel loci associated with POAG. Applying the derived endophenotypes to 47,908 UK Biobank participants, we performed GWAS in European (EUR), African, and Asian ancestry groups, followed by cross-ancestry meta-analyses. In total, we identified 36 and 43 linkage disequilibrium (LD)-independent loci that passed genome-wide significance in the EUR GWAS and the cross-ancestry meta-analysis, respectively. About two thirds of the identified loci overlapped with previously reported POAG-related associations, demonstrating the validity of our approach. Importantly, more than a third (21) of the loci were novel to glaucoma. Extensive functional analyses, including Bayesian colocalization analysis, gene-based association tests, Mendelian randomization, and single-cell enrichment analysis, converged on 11 high-confidence effector genes, five of which are novel to glaucoma. These genes support roles for Wnt-mediated outflow dysfunction and retinal ganglion cell vulnerability in POAG pathogenesis and are potentially actionable drug targets. Our findings expand the known POAG genetic associations, provide mechanistic insights at cell-type resolution, and nominate putative causal genes. This study provides a powerful, generalizable ML-driven strategy for accelerating the discovery of disease mechanisms and therapeutic targets for complex diseases.
One Sentence Summary
An ML framework that bridges clinical eye scans with biobank genetics identified 21 new genetic loci and 11 putative causal genes for glaucoma.