Imputation and polygenic score performance of low coverage whole-genome sequencing and genotyping arrays in diverse human populations
Abstract
Genome-wide association studies and polygenic score analysis rely on large-scale genotypic data, traditionally obtained through SNP arrays and imputation. However, low coverage whole-genome sequencing has emerged as a promising alternative. This study presents a comprehensive comparison of imputation accuracy and polygenic score performance between eight high-performance genotyping arrays and six low coverage whole-genome sequencing depths (0.5–2x) across diverse populations. We analyze data from 2,504 individuals in the 1000 Genomes Project using a 10-fold cross-imputation strategy to evaluate imputation accuracy and polygenic score performance for four complex traits. Our results demonstrate that low coverage whole-genome sequencing performs competitively with population-specific arrays in both imputation accuracy and polygenic score estimation. Notably, low coverage whole-genome sequencing outperforms arrays in underrepresented populations and for rare and low-frequency variants. Our findings suggest that low coverage whole-genome sequencing offers a flexible and powerful alternative to genotyping arrays for large-scale genetic studies, particularly in diverse or underrepresented populations.
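To make the evaluation design concrete, the following is a minimal sketch of how a 10-fold cross-imputation split and the two evaluated quantities could be set up. It is illustrative only: the function names (`ten_fold_splits`, `squared_correlation`, `polygenic_score`) are hypothetical, and it assumes that imputation accuracy is measured as the squared Pearson correlation between true genotypes and imputed dosages and that a polygenic score is a weighted sum of dosages. Both are common choices, but the abstract does not state which metrics or tools were actually used.

```python
import random
from statistics import mean


def ten_fold_splits(sample_ids, k=10, seed=0):
    """Yield (reference, target) pairs for k-fold cross-imputation.

    Each fold is treated once as the target set to be imputed, while the
    remaining k-1 folds serve as the reference panel (assumed design).
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]
    for i, target in enumerate(folds):
        reference = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield reference, target


def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2)
    and imputed dosages (0.0 to 2.0) at a single variant."""
    mx, my = mean(true_genotypes), mean(imputed_dosages)
    sxy = sum((x - mx) * (y - my) for x, y in zip(true_genotypes, imputed_dosages))
    sxx = sum((x - mx) ** 2 for x in true_genotypes)
    syy = sum((y - my) ** 2 for y in imputed_dosages)
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic variant: correlation undefined
    return (sxy * sxy) / (sxx * syy)


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages for one individual (assumed additive model)."""
    return sum(d * w for d, w in zip(dosages, weights))
```

With 2,504 sample identifiers, each target fold in this sketch holds 250 or 251 individuals and each reference panel holds the remaining samples. Per-variant accuracy values can then be grouped by allele frequency or population before comparing arrays with sequencing depths.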