Improved Allele Frequencies in gnomAD through Local Ancestry Inference
Abstract
The Genome Aggregation Database (gnomAD) is a foundational resource for allele frequency data, widely used in genomic research and clinical interpretation. However, traditional estimates rely on individual-level genetic ancestry groupings that may obscure variation in recently admixed populations. To improve resolution, we applied local ancestry inference (LAI) to over 27 million variants in two admixed groups: Admixed American (n = 7,612) and African/African American (n = 20,250), deriving ancestry-specific allele frequencies. We show that 78.5% and 85.1% of variants in these groups, respectively, exhibit at least a twofold difference in ancestry-specific frequencies. Moreover, 81.49% of variants with LAI information would be assigned a higher gnomAD-wide maximum frequency after incorporating LAI, potentially altering clinical interpretations. This LAI-informed release reveals clinically relevant frequency differences that are masked in aggregate estimates and may support reclassifying some variants from Uncertain Significance to Benign or Likely Benign.
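The arithmetic behind ancestry-specific frequencies can be illustrated with a minimal Python sketch. It is not the gnomAD pipeline: the `TractCounts` type, the function names, the ancestry labels, and the example counts are all hypothetical, and the handling of a zero frequency in the twofold comparison is an assumption rather than the preprint's definition. The sketch assumes that local ancestry inference has already assigned each haplotype segment to an ancestry, so that alternate allele counts (AC) and allele numbers (AN) can be tallied per ancestry.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TractCounts:
    """Hypothetical per-ancestry tallies for one variant in one admixed group."""
    ac: int  # alternate alleles seen on haplotype tracts assigned to this ancestry
    an: int  # total called alleles on haplotype tracts assigned to this ancestry


def allele_frequency(counts: TractCounts) -> Optional[float]:
    """Return AC / AN, or None when no alleles were called."""
    return counts.ac / counts.an if counts.an > 0 else None


def ancestry_specific_afs(per_ancestry: Dict[str, TractCounts]) -> Dict[str, Optional[float]]:
    """Compute one allele frequency per local-ancestry component."""
    return {name: allele_frequency(c) for name, c in per_ancestry.items()}


def has_twofold_difference(afs: Dict[str, Optional[float]]) -> bool:
    """True when the highest ancestry-specific AF is at least twice the lowest.

    Assumption: a variant absent from one ancestry but present in another
    counts as an at-least-twofold difference.
    """
    values = [f for f in afs.values() if f is not None]
    if len(values) < 2 or max(values) == 0:
        return False
    lowest, highest = min(values), max(values)
    return lowest == 0 or highest / lowest >= 2


def max_frequency(aggregate_af: float, afs: Dict[str, Optional[float]]) -> float:
    """Maximum over the aggregate AF and all defined ancestry-specific AFs."""
    return max([aggregate_af] + [f for f in afs.values() if f is not None])


# Made-up counts for a single variant; not taken from gnomAD.
counts = {
    "african": TractCounts(ac=30, an=1000),
    "european": TractCounts(ac=1, an=250),
}
afs = ancestry_specific_afs(counts)

# Aggregate AF pools all haplotypes in the group regardless of local ancestry.
aggregate_af = sum(c.ac for c in counts.values()) / sum(c.an for c in counts.values())
lai_max = max_frequency(aggregate_af, afs)
```

With these made-up counts the aggregate frequency is 31/1250 = 0.0248, while the frequency on the hypothetical African tracts is 0.03, so the LAI-informed maximum exceeds the aggregate estimate. This is the kind of shift the abstract describes as potentially altering a frequency-based interpretation.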