Selecting variant masks to improve power and replicability of gene-level burden tests

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Rare coding variant association studies typically perform gene-level association tests in which variants are filtered (or “masked”) and aggregated based on functional annotation and allele frequency. As there is little research and no consensus regarding masking strategies to use, we investigated the impact of masking strategies on gene-level burden tests, the most widely used and interpretable type of aggregate association test. A systematic review of 234 studies catalogued 664 masks and masking strategies that rarely repeated across studies. Analyzing 54 traits within 189,947 UK Biobank exomes, we show that the number of significant associations greatly depends on the masking strategy employed (ranging from 58 to 2,523 associations) and, consequently, separate published analyses of this dataset report minimally overlapping associations (<30%). By empirically determining mask combinations that maximize the number of significant associations, we propose masking strategies that detect twice as many significant low-frequency and rare variant associations as the “average” strategies previously employed, with consistent performance across many traits. Our analyses demonstrate the inconsistency of previously used variant masking strategies and provide a simple solution to increase power and replicability in future studies.

Article activity feed