Comparing different methods of estimating GWAS heritability with a new approach using only summary statistics
Abstract
SNP heritability ($h^2_{SNP}$; the variance explained by all SNPs used in a genome-wide association study) accounts for most of the genetic variation in many traits. However, there is still a gap between it and GWAS heritability ($h^2_{GWAS}$; the variance explained by genome-wide significant SNPs). This gap is called hidden heritability.
There are several methods for estimating $h^2_{GWAS}$: the linear mixed model (LMM), polygenic risk scores (PRS), multiple linear regression (MLR), and simple linear regression (SLR). However, it is unclear which methods are more accurate under different circumstances. This study proposes a PRS-based method for estimating $h^2_{GWAS}$ that uses pseudo summary statistics (PRS-PSS). It compares this method with existing methods using both simulated and real data (10 traits from the UK Biobank, UKBB) to determine when each method gives realistic estimates that can be trusted as a final estimate.
Simulation results showed that PRS-based methods underestimate $h^2_{GWAS}$ by nearly 20% when all causal SNPs are considered, but they are relatively accurate when a subset of causal SNPs is used. They perform much better than the SLR method for all 10 traits. When applied to real data, however, they show no consistent pattern of overestimation or underestimation relative to the base model (LMM).
My suggestion is to report the estimate from LMM, or the adjusted R² from MLR, when an independent data set is available. When only summary statistics are available, PRS-PSS is a relatively accurate alternative, especially compared to SLR, which tends to overestimate by 20-50% when applied to real data.
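
To make the MLR and SLR estimators concrete, the following is a minimal Python sketch on simulated data. It is an illustration under stated assumptions, not the study's pipeline: the function names (`simulate`, `adjusted_r2_mlr`, `summed_r2_slr`), the sample size, the number of SNPs, the target heritability of 0.3, and the choice of independent SNPs are all hypothetical. In particular, treating "SLR" as the sum of per-SNP marginal R² values is an assumption about what that estimator computes.

```python
import numpy as np


def simulate(n=5000, m=20, h2=0.3, seed=0):
    """Simulate standardized genotypes for m independent SNPs and a phenotype.

    All settings here are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.1, 0.5, size=m)            # minor allele frequencies
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    X = (G - G.mean(axis=0)) / G.std(axis=0)       # standardize each SNP
    g = X @ rng.normal(size=m)                     # raw genetic value
    g *= np.sqrt(h2 / g.var())                     # rescale so var(g) == h2
    e = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)  # environmental noise
    return X, g + e


def adjusted_r2_mlr(X, y):
    """Adjusted R^2 from a joint (multiple) linear regression of y on all SNPs."""
    n, m = X.shape
    A = np.column_stack([np.ones(n), X])           # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - resid.var() / y.var()
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)


def summed_r2_slr(X, y):
    """Sum of marginal R^2 values, one simple regression per SNP (assumed SLR form)."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return float(np.sum(r ** 2))


if __name__ == "__main__":
    X, y = simulate()
    print("adjusted R^2 (MLR):", round(adjusted_r2_mlr(X, y), 3))
    print("summed R^2 (SLR):  ", round(summed_r2_slr(X, y), 3))
```

With independent SNPs, as simulated here, both estimates should land near the simulated value. With correlated SNPs, summing marginal R² values can count shared signal more than once, which may be one reason an SLR-style estimate runs high on real data.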