Genotype to phenotype analysis on the scale of millions of human genomes

Abstract

Linear mixed models (LMMs) are the most-established model of the genotype-to-phenotype relationship in populations of individuals, but they do not scale to the hundreds of thousands of genotyped individuals in large human biobanks. We developed a scalable framework for fitting likelihood-based multi-component LMMs with full-rank covariance. It scales to several million samples with high statistical accuracy, rivaling exact computation, without the high computational and memory demands of previous state-of-the-art methods. Our scalable LMM (SLMM) implementation can be distributed across one or many computers, enabling fast and accurate large-scale analysis. We applied SLMM to ~300,000 individuals in the UK Biobank to estimate heritability, infer selection-related parameters, and incorporate prior knowledge of specific genome components, such as codons, gene promoters and terminators, as well as biological pathway gene sets, to perform functional enrichment on several phenotypes from individual-level data.
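
To make the heritability estimation mentioned above concrete, the following is a minimal sketch on simulated data of a single-component variance-component model, in which the covariance between two individuals' standardized phenotypes is assumed to equal the heritability times their genomic relatedness. It uses Haseman-Elston regression, a simple moment-based estimator swapped in for illustration; it is not the likelihood-based SLMM method described in the abstract. The function names `simulate` and `he_regression`, the sample sizes, and the simulation settings are all hypothetical choices.

```python
import numpy as np


def simulate(n, m, h2, seed=0):
    """Simulate standardized genotypes Z (n x m) and a phenotype y with heritability h2."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=m)           # allele frequencies (assumed range)
    G = rng.binomial(2, freqs, size=(n, m)).astype(float)
    Z = (G - G.mean(axis=0)) / G.std(axis=0)        # standardize each variant
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)  # every variant has a small effect
    y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return Z, y


def he_regression(Z, y):
    """Haseman-Elston estimate of heritability from off-diagonal relatedness."""
    n, m = Z.shape
    K = Z @ Z.T / m                                  # genomic relationship matrix (n x n)
    y = (y - y.mean()) / y.std()                     # standardize the phenotype
    iu = np.triu_indices(n, k=1)                     # each pair i < j once
    k = K[iu]
    prod = np.outer(y, y)[iu]                        # phenotype cross-products y_i * y_j
    # Least-squares slope through the origin of y_i * y_j on K_ij
    return float(k @ prod / (k @ k))


Z, y = simulate(n=2000, m=500, h2=0.5, seed=1)
h2_hat = he_regression(Z, y)
print(f"estimated heritability: {h2_hat:.3f}")
```

Note that this sketch builds the dense n x n relationship matrix explicitly, so its memory use grows quadratically with the number of individuals. That cost is one reason exact approaches become impractical at biobank scale, which is the problem the abstract says SLMM is designed to address.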
