Benchmarking non-additive genetic effects on polygenic prediction and machine learning-based approaches
Abstract
Polygenic scores (PGSs) are widely used to translate genome-wide association study (GWAS) findings into tools for genetic risk prediction. Most current approaches assume additive effects, yet the contribution of non-additive variation to predictive performance remains unclear. Here we investigate the impact of dominance deviations on polygenic prediction using simulated phenotypes and ten complex traits from the UK Biobank. We compare four approaches: a standard additive PGS, a dominance-adjusted PGS, gradient-boosted decision trees, and neural networks. Across most scenarios, the additive PGS performed robustly, but its accuracy declined when traits had low polygenicity, high SNP heritability, and substantial dominance effects. These results delineate the conditions under which additive models suffice and highlight when more flexible machine learning methods may be advantageous.
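To make the distinction between the additive and dominance-adjusted scores concrete, the following is a minimal sketch, not the authors' implementation. It assumes genotypes coded as minor-allele counts (0, 1, 2), per-SNP additive weights `betas`, and per-SNP dominance weights `deltas` applied to heterozygotes only. The heterozygote-indicator coding and all function and variable names are illustrative assumptions; the abstract does not specify the coding used.

```python
from typing import List, Sequence


def additive_pgs(genotypes: Sequence[Sequence[int]],
                 betas: Sequence[float]) -> List[float]:
    """Additive score: weighted sum of allele counts (0/1/2) per individual."""
    return [sum(g * b for g, b in zip(row, betas)) for row in genotypes]


def dominance_adjusted_pgs(genotypes: Sequence[Sequence[int]],
                           betas: Sequence[float],
                           deltas: Sequence[float]) -> List[float]:
    """Additive score plus a dominance term for heterozygous genotypes.

    Assumption: dominance is coded with a simple heterozygote indicator
    (1 if the allele count is 1, else 0). Other codings exist.
    """
    base = additive_pgs(genotypes, betas)
    return [
        score + sum(d for g, d in zip(row, deltas) if g == 1)
        for score, row in zip(base, genotypes)
    ]


# Toy data (hypothetical): two individuals, three SNPs.
genotypes = [[0, 1, 2],
             [1, 1, 0]]
betas = [0.5, -1.0, 2.0]    # additive effect per allele copy
deltas = [1.0, 0.0, 0.5]    # extra effect for heterozygotes

additive_scores = additive_pgs(genotypes, betas)
adjusted_scores = dominance_adjusted_pgs(genotypes, betas, deltas)
```

In this toy example the two scores differ only for the second individual, who is heterozygous at a SNP with a non-zero dominance weight. When all `deltas` are zero, the dominance-adjusted score reduces to the additive one.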