Explicitly modeling genetic ancestry to improve polygenic prediction accuracy for height in a large, admixed cohort of US Latinos: Findings from HCHS/SOL



Abstract

Polygenic scores (PGS) offer moderate to high prediction accuracy for complex traits, but most are developed in European-ancestry cohorts, which reduces their performance in populations of other ancestries. This study aimed to improve prediction of standing height, a heritable and ancestry-influenced trait, in an admixed Latino cohort (HCHS/SOL) by modeling ancestry with principal components (PCs) alongside PGS. SNPs were selected from a large European-ancestry GWAS using various p-value thresholds, and SNP weights were trained using traditional and penalized regression in the UK Biobank (UKB). Models combining PGS with PCs were trained separately in HCHS/SOL and UKB. Compared to PGS alone, modeling PGS with PCs substantially improved height prediction in HCHS/SOL (R² increase of ∼0.1), whereas only a mild improvement was observed in UKB (R² increase of ∼0.01). These results underscore the importance of incorporating genetic ancestry into predictive models for admixed populations, particularly when the trait exhibits ancestry-specific associations.
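
The comparison described in the abstract (PGS alone versus PGS with PCs, judged by the change in R²) can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are simulated, the effect sizes are arbitrary, and the PCs are drawn independently of the PGS, which is a simplification that real admixed cohorts would not satisfy. All function and variable names (`fit_ols`, `compare_models`, `height`, and so on) are hypothetical, and plain ordinary least squares stands in for whatever regression the study used.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def predict_ols(beta, X):
    """Apply coefficients from fit_ols to new predictors."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta


def compare_models(pgs, pcs, y, train_frac=0.7, seed=0):
    """Held-out R² for a PGS-only model and a PGS + PCs model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    train, test = idx[:n_train], idx[n_train:]

    designs = {
        "pgs_only": pgs.reshape(-1, 1),
        "pgs_plus_pcs": np.column_stack([pgs, pcs]),
    }
    scores = {}
    for name, X in designs.items():
        beta = fit_ols(X[train], y[train])
        scores[name] = r_squared(y[test], predict_ols(beta, X[test]))
    return scores


# Simulated cohort (assumption: not HCHS/SOL or UKB data).
rng = np.random.default_rng(42)
n = 5000
pcs = rng.normal(size=(n, 5))   # five ancestry PCs
pgs = rng.normal(size=n)        # standardized polygenic score
# Arbitrary effects: height depends on the PGS and on the first two PCs.
height = 0.5 * pgs + 0.4 * pcs[:, 0] + 0.2 * pcs[:, 1] + rng.normal(scale=0.8, size=n)

result = compare_models(pgs, pcs, height)
print(result)
print("R² gain from adding PCs:", result["pgs_plus_pcs"] - result["pgs_only"])
```

In this toy setup the gain from adding PCs comes entirely from the simulated ancestry effect on the trait. If the PC coefficients are set to zero, the two models perform about the same, which mirrors the abstract's contrast between an admixed cohort and a more homogeneous one.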
