The distribution of highly deleterious variants across human ancestry groups
Abstract
A major focus of human genetics is to map severe disease mutations. Increasingly, that goal is understood as requiring huge numbers of people to be sequenced from every broadly defined genetic ancestry group, so as not to miss “ancestry-specific variants.” Here, we consider whether this focus is warranted. We start from first-principles considerations, based on models of mutation–drift–selection balance, which suggest that since severe disease mutations tend to be strongly deleterious, and thus evolutionarily young, they will be kept at relatively constant frequency through recurrent mutation. Therefore, highly pathogenic alleles should be shared identically by descent within extended families, not broad ancestry groups, and sequencing more people should yield similar numbers of such alleles regardless of ancestry. We test the model predictions using gnomAD genetic ancestry groupings and show that they provide a good fit to the classes of variants most likely to be highly pathogenic, notably sets of loss-of-function alleles at strongly constrained genes. These findings clarify that strongly deleterious alleles will be found at comparable rates in people of all ancestries, and the information they provide about human biology is shared across ancestries.
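The intuition behind this argument can be illustrated with the standard textbook approximation for deterministic mutation-selection balance, q ≈ u / (hs), which holds when selection against heterozygotes is strong relative to drift. The sketch below is not the authors' model. The function names and all parameter values are hypothetical and chosen only for illustration. What it shows is that the expected equilibrium frequency, and hence the expected number of copies in a sample, depends on the mutation rate u and the heterozygous selection coefficient hs, not on population size or demographic history.

```python
def equilibrium_frequency(u: float, hs: float) -> float:
    """Approximate equilibrium frequency under mutation-selection balance.

    Uses the textbook approximation q ~ u / (hs), valid when selection
    against heterozygotes (hs) is strong relative to genetic drift.

    u  : mutation rate to the deleterious class per generation (assumed value)
    hs : fitness reduction in heterozygous carriers (assumed value)
    """
    if u <= 0 or hs <= 0:
        raise ValueError("u and hs must be positive")
    return u / hs


def expected_copies(n_diploid: int, u: float, hs: float) -> float:
    """Expected number of deleterious allele copies among n diploid people.

    Each person carries two chromosomes, so the expectation is 2 * n * q.
    Population size does not enter the calculation.
    """
    if n_diploid < 0:
        raise ValueError("n_diploid must be non-negative")
    return 2 * n_diploid * equilibrium_frequency(u, hs)
```

Under this approximation, calling `expected_copies` with the same sample size, `u`, and `hs` gives the same expectation for any ancestry group, which is the sense in which sequencing additional people should yield similar numbers of strongly deleterious alleles regardless of ancestry.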