Leveraging Global Genetics Resources to Enhance Polygenic Prediction Across Ancestrally Diverse Populations
Abstract
Introduction
Genome-wide association studies (GWAS) from multiple ancestral populations are increasingly available, offering opportunities to improve the accuracy and equity of polygenic scores (PGS). Several methods now aim to leverage multiple GWAS sources, but their predictive performance and computational efficiency across contexts remain unclear, especially when individual-level tuning data are unavailable.
Methods
This study evaluates a comprehensive set of PGS methods across African (AFR), East Asian (EAS), and European (EUR) ancestry groups for 10 complex traits, using summary statistics from the Ugandan Genome Resource, Biobank Japan, and UK Biobank. Single-source PGS were derived using DBSLMM, lassosum, LDpred2, MegaPRS, pT+clump, PRS-CS, QuickPRS, and SBayesRC. Multi-source approaches included PRS-CSx, TL-PRS, X-Wing, and combinations of independently optimised single-source scores. A key contribution is a novel application of the LEOPARD method to estimate optimal linear combinations of population-specific PGS using only summary statistics. All analyses were implemented using the GenoPred software pipeline.
Results
In AFR and EAS populations, PGS combining ancestry-aligned and European GWAS outperformed single-source models. Linear combinations of independently optimised scores consistently outperformed current jointly optimised multi-source methods, while being substantially more computationally efficient. The LEOPARD extension offered a practical solution for tuning these combinations when only summary statistics were available, achieving performance comparable to tuning with individual-level data.
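As an illustration of what a linear combination of independently optimised scores involves, the following minimal sketch estimates weights for two population-specific PGS by ordinary least squares in a tuning sample with individual-level data. It is not the LEOPARD summary-statistic procedure and is not taken from GenoPred; the function names (`standardise`, `fit_two_score_weights`, `combine`) and the toy numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def standardise(values):
    """Centre and scale a list of values to mean 0 and (population) variance 1."""
    m = mean(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - m) / sd for v in values]


def fit_two_score_weights(pgs_a, pgs_b, phenotype):
    """Least-squares weights for phenotype ~ w_a * pgs_a + w_b * pgs_b.

    All three inputs are standardised first, so the normal equations
    reduce to a 2x2 system in the pairwise correlations.
    """
    a, b, y = standardise(pgs_a), standardise(pgs_b), standardise(phenotype)
    n = len(y)
    r_ab = sum(x * z for x, z in zip(a, b)) / n
    r_ay = sum(x * z for x, z in zip(a, y)) / n
    r_by = sum(x * z for x, z in zip(b, y)) / n
    det = 1.0 - r_ab ** 2
    if det <= 1e-12:
        raise ValueError("the two scores are collinear; weights are not identifiable")
    w_a = (r_ay - r_ab * r_by) / det
    w_b = (r_by - r_ab * r_ay) / det
    return w_a, w_b


def combine(pgs_a, pgs_b, w_a, w_b):
    """Weighted sum of the two standardised scores."""
    a, b = standardise(pgs_a), standardise(pgs_b)
    return [w_a * x + w_b * z for x, z in zip(a, b)]


# Toy tuning sample (made-up numbers, not data from the study).
pgs_target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # e.g. score from ancestry-aligned GWAS
pgs_eur = [2.0, 1.0, 4.0, 3.0, 6.0, 4.0]      # e.g. score from European GWAS
trait = [0.7 * a + 0.3 * b for a, b in zip(pgs_target, pgs_eur)]

w_target, w_eur = fit_two_score_weights(pgs_target, pgs_eur, trait)
multi_source_pgs = combine(pgs_target, pgs_eur, w_target, w_eur)
```

In practice the weights would be estimated in one sample and the combined score evaluated in an independent one; the summary-statistic approach described above aims to obtain comparable weights without access to such a tuning sample.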
Conclusion
These findings highlight a flexible and generalisable framework for multi-source PGS construction. The GenoPred pipeline enables researchers to tailor methods to data availability and study goals, supporting more equitable, accurate, and accessible polygenic prediction.