LDSC++: Improving linkage disequilibrium score regression estimation of heritability and genetic correlation for multivariate GWAS analysis

Abstract

Introduction

Linkage disequilibrium (LD) score regression is widely used to estimate common-variant heritability and genetic correlations from genome-wide association study (GWAS) summary statistics. We hypothesise that segmented regression (also known as piecewise regression) improves on previous LD score regression implementations when estimating both genetic covariance and its standard error.
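For readers unfamiliar with the baseline method, the sketch below illustrates plain, unweighted LD score regression on noise-free toy data. It is not the LDSC++ algorithm and includes no segmented regression. It assumes the standard model in which the expected chi-square statistic of variant j is 1 + N·h²·ℓ_j / M (sample size N, number of variants M, LD score ℓ_j). The function names `ols_line` and `ldsc_h2` and all toy values are hypothetical.

```python
import random

def ols_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ldsc_h2(chi2, ld_scores, n_samples, n_variants):
    """Convert the regression slope into a heritability estimate.

    Under the assumed model the slope equals N * h2 / M,
    so h2 = slope * M / N. Returns (h2, intercept).
    """
    intercept, slope = ols_line(ld_scores, chi2)
    return slope * n_variants / n_samples, intercept

# Toy data (assumed values): chi-square statistics set to their expectations.
rng = random.Random(0)
N, M, TRUE_H2 = 50_000, 1_000, 0.4
ld = [rng.uniform(1.0, 200.0) for _ in range(M)]
chi2 = [1.0 + N * TRUE_H2 / M * l for l in ld]

h2_hat, intercept_hat = ldsc_h2(chi2, ld, N, M)
print(f"h2 = {h2_hat:.3f}, intercept = {intercept_hat:.3f}")
```

Because the toy statistics are noise-free, the fit recovers the simulated heritability and an intercept of 1. Real summary statistics are noisy and heteroscedastic, which is why practical implementations add regression weights and resampling-based standard errors.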

Methods

We present novel extensions to LD score regression (LDSC++) that improve (i) handling of varying numbers of shared genetic variants across trait pairs and reference panels, (ii) estimation of genetic covariance and its variance, and (iii) handling of imputation quality. We also propose supporting statistical tests that use these extensions to improve sensitivity and that are designed for comparing highly correlated parameter estimates, such as those obtained for the same trait with different methods. We validate LDSC++ first on real-world individual-level data from the Genetic Links to Anxiety and Depression study and the United Kingdom National Institute of Health and Social Care Research BioResource (N = 14,190 to 20,144), second on simulated data with different degrees of shared QTL, and third on a battery of publicly available GWASs of ten diverse traits of varying statistical power and heritability.
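To illustrate why correlated estimates need a dedicated comparison, the sketch below uses the textbook z-test for the difference of two correlated estimates, where Var(a − b) = SE_a² + SE_b² − 2·r·SE_a·SE_b. This is a generic formula chosen for illustration, not necessarily the test proposed in the article. The function name `correlated_diff_test`, the correlation `r`, and the example values are assumptions.

```python
import math

def correlated_diff_test(est_a, se_a, est_b, se_b, r):
    """Two-sided z-test for est_a - est_b when the estimates have correlation r.

    Returns (z, p). Setting r = 0 gives the usual independent-sample test.
    """
    var_diff = se_a ** 2 + se_b ** 2 - 2.0 * r * se_a * se_b
    z = (est_a - est_b) / math.sqrt(var_diff)
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical heritability estimates for one trait from two methods.
z_corr, p_corr = correlated_diff_test(0.25, 0.02, 0.22, 0.02, r=0.9)
z_indep, p_indep = correlated_diff_test(0.25, 0.02, 0.22, 0.02, r=0.0)
print(f"r=0.9: z={z_corr:.3f}, p={p_corr:.4f}")
print(f"r=0.0: z={z_indep:.3f}, p={p_indep:.4f}")
```

With these example numbers, the same difference is significant at the 5% level only when the positive correlation is accounted for, because ignoring it overstates the variance of the difference.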

Results

Using variance-component method (GCTA-GREML) estimates as reference, the LDSC++ extensions yielded heritability estimates with a bias of about -10% to -20%, whereas standard LD score regression yielded a bias of -30%. The corresponding estimates of heritability variability had a bias of -1% to -7% with LDSC++, against 8% with standard LD score regression. For ten external trait GWASs, LDSC++ recovered 5% to 8% larger heritabilities with 4% smaller variability on average compared to standard LD score regression. Weighting by imputation quality in the model, rather than excluding genetic variants of low imputation quality, helped to retain information. Our supporting statistical tests enabled us to detect statistically significant differences in genetic covariance and its standard error while accounting for the varying number of shared genetic variants across bivariate trait pairs.
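The contrast between weighting by imputation quality and excluding low-quality variants can be sketched with a weighted least-squares fit. The example below is a toy illustration, not the LDSC++ weighting scheme. It assumes that the noise variance of a variant's chi-square statistic is inversely proportional to its imputation quality (INFO) score, so that the INFO score acts as an inverse-variance weight. The threshold of 0.8, the function name `wls_line`, and all simulated values are assumptions.

```python
import math
import random

def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Toy data (assumed model): poorly imputed variants get noisier statistics.
rng = random.Random(1)
N, M, TRUE_H2 = 50_000, 2_000, 0.3
ld = [rng.uniform(1.0, 200.0) for _ in range(M)]
info = [rng.uniform(0.3, 1.0) for _ in range(M)]
chi2 = [
    1.0 + N * TRUE_H2 / M * l + rng.gauss(0.0, 2.0 / math.sqrt(q))
    for l, q in zip(ld, info)
]

# Strategy 1: hard filter, expressed as 0/1 weights (INFO >= 0.8 is kept).
hard = [1.0 if q >= 0.8 else 0.0 for q in info]
_, slope_filtered = wls_line(ld, chi2, hard)

# Strategy 2: keep every variant and weight it by its INFO score.
_, slope_weighted = wls_line(ld, chi2, info)

h2_filtered = slope_filtered * M / N
h2_weighted = slope_weighted * M / N
n_kept = int(sum(hard))
print(f"hard filter: {n_kept} of {M} variants, h2 = {h2_filtered:.4f}")
print(f"INFO weights: {M} of {M} variants, h2 = {h2_weighted:.4f}")
```

The hard filter discards most of the simulated variants, while the weighted fit uses all of them and lets the less reliable ones count for less.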

Conclusion

LDSC++ produced less biased estimates of genetic covariance and its variability in our GLAD+ sample than standard LD score regression, using GCTA-GREML as reference. This performance was supported by results from external trait GWASs of varying character, which also points to an important contribution from our extended weighting schemes. Our proposed extensions to LD score regression, including the construction of genome-wide parameters as aggregates of heterogeneous local parameters, may prove important for large-scale multivariate studies such as genomic structural equation models or local genetic covariance analyses.
