Flexibly Modeling Rare Variant Pathogenicity Improves Gene Discovery for Complex Traits
Abstract
Rare variant burden tests can directly identify genes that influence complex traits, but their power is limited by our ability to separate functional from benign alleles. We introduce FlexRV, an approach that greatly improves the power to detect gene-based associations in rare variant aggregation tests by modelling nonlinear relationships between functional annotations and phenotype. Across 62 quantitative and 44 disease traits in the UK Biobank, we show that FlexRV outperforms previous approaches such as DeepRVAT, STAAR, and Regenie, discovering 51% more quantitative and 102% more disease trait associations than the widely used Regenie method. Compared to discoveries from other methods, gene-phenotype associations identified by FlexRV replicated at a higher rate in the independent All of Us cohort and were more highly enriched at genes nominated by common variant genome-wide association studies. We explore the genetic architecture of complex traits using FlexRV burden tests, finding nearly equal contributions from missense and loss-of-function variants to rare variant burden heritability. FlexRV weights can also be incorporated into rare variant polygenic scores, improving their ability to identify individuals with extreme phenotypes. Our study illustrates the benefits of modelling nonlinear relationships between annotated variant effects and their downstream phenotypes in rare variant studies.
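To make the idea of annotation-weighted burden scores concrete, the following is a minimal illustrative sketch in Python. It is not the FlexRV model: the abstract does not specify the functional form, so the sigmoid weight function, its `midpoint` and `steepness` parameters, and the toy genotype and annotation values are all assumptions chosen only to contrast a linear with a nonlinear mapping from annotation score to variant weight.

```python
import math


def linear_weight(score):
    """Use the annotation score itself as the variant weight."""
    return score


def sigmoid_weight(score, midpoint=0.5, steepness=10.0):
    """Hypothetical nonlinear map from an annotation score in [0, 1] to a weight.

    The sigmoid shape and its parameters are illustrative assumptions,
    not values taken from the preprint.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (score - midpoint)))


def burden_scores(genotypes, annotation_scores, weight_fn):
    """Compute one weighted burden score per individual for a single gene.

    genotypes: one row per individual, holding the alternate allele
               count (0, 1, or 2) at each rare variant in the gene.
    annotation_scores: one predicted-deleteriousness score per variant.
    weight_fn: function mapping an annotation score to a variant weight.
    """
    weights = [weight_fn(s) for s in annotation_scores]
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


# Toy data (made up for illustration): three variants, three individuals.
annotation_scores = [0.1, 0.5, 0.9]
genotypes = [
    [1, 0, 0],  # carries only the low-scoring variant
    [0, 0, 1],  # carries only the high-scoring variant
    [0, 0, 0],  # non-carrier
]

linear = burden_scores(genotypes, annotation_scores, linear_weight)
nonlinear = burden_scores(genotypes, annotation_scores, sigmoid_weight)
```

Under these assumptions, the sigmoid weighting pushes the low-scoring variant's contribution toward zero and the high-scoring variant's toward one, so the two carriers are separated more sharply than under linear weighting. In a burden test, the resulting per-individual scores would then be regressed against the phenotype.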