The Biobank Rare Variant consortium powers the discovery of rare genetic associations through global collaboration

Abstract

Rare coding variants can have large effects on disease risk and provide direct routes from human genetics to disease mechanisms and therapeutic targets, but their discovery is constrained by sample size, particularly for low-prevalence diseases. Here we establish the Biobank Rare Variant Analysis (BRaVa) consortium, a global rare variant association resource that integrates sequencing and linked health-record data from ten biobanks and cohorts comprising over 1.2 million individuals across diverse ancestries.

We performed gene-based meta-analyses of rare coding variation across 33 clinical endpoints and 11 quantitative traits. Aggregating evidence across biobanks and ancestries identified 514 gene-trait associations, including 31 that, following systematic literature review, had not been reported in prior studies or curated association resources. Notably, 36.1% of gene-level associations were undetectable in any individual biobank, and 91 emerged only through cross-ancestry meta-analysis, demonstrating that federated integration enables discovery beyond the reach of single cohorts. Similar gains were observed at the variant level, where 25.0% of phenotype-locus associations were detectable only through meta-analysis.
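
To illustrate how an association can be undetectable in every individual biobank yet emerge on aggregation, the following minimal sketch combines per-cohort effect estimates with a fixed-effect inverse-variance-weighted meta-analysis. The abstract does not state which meta-analysis method BRaVa uses, so this technique is an assumption chosen for illustration only. The effect sizes, standard errors, and significance threshold are all hypothetical values, not results from the study.

```python
import math

# Hypothetical significance threshold for illustration only;
# the abstract does not state the threshold used by the consortium.
THRESHOLD = 2.5e-6


def two_sided_p(beta, se):
    """Two-sided p-value for a normal test statistic beta / se."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    Returns the pooled effect estimate and its standard error.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Made-up per-cohort estimates for one gene-trait pair (not study data).
cohort_betas = [0.50, 0.45, 0.55]
cohort_ses = [0.15, 0.15, 0.15]

# Each cohort tested on its own.
single_ps = [two_sided_p(b, s) for b, s in zip(cohort_betas, cohort_ses)]

# All cohorts combined.
meta_beta, meta_se = inverse_variance_meta(cohort_betas, cohort_ses)
meta_p = two_sided_p(meta_beta, meta_se)

for i, p in enumerate(single_ps, start=1):
    print(f"cohort {i}: p = {p:.2e}, significant = {p < THRESHOLD}")
print(f"meta-analysis: beta = {meta_beta:.3f}, se = {meta_se:.4f}, "
      f"p = {meta_p:.2e}, significant = {meta_p < THRESHOLD}")
```

In this toy setting no single cohort passes the threshold, but the pooled estimate does, because the pooled standard error shrinks as cohorts with consistent effects are added.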

Effect size estimates were correlated across ancestries with concordant directions of effect, supporting the generalizability of rare variant associations. The identified signals implicate pathways involved in transcriptional and epigenetic regulation, metabolism, vascular and epithelial biology, and immune function, highlighting rare coding variation as an engine for biological discovery across medical record phenotypes. For example, damaging variation in ANKRD12 implicates inflammatory transcriptional dysregulation in asthma and chronic obstructive pulmonary disease, and ultra-rare predicted loss-of-function variants in NAA15 link protein acetylation processes to type 2 diabetes risk.

BRaVa establishes a scalable framework and freely available community resource for rare variant meta-analysis across global biobanks. Public release of gene- and variant-level association summary statistics provides a reference map of rare coding variant associations to support disease gene discovery, biological interpretation, and therapeutic target prioritization as sequencing-linked health-record resources continue to expand.
