Polygenic prediction of phenotypes with a neural empirical Bayes approach
Abstract
Polygenic risk scores (PRS) estimate the expected value of a phenotype from an individual's genotype. Although statistical approaches for calculating PRS have advanced considerably in recent years, few methods incorporate recently generated functional genomics atlases to improve SNP weight estimation. Here, we introduce PRS with a Functional Neural Network (PRSFNN), a novel approach that uses a neural network within an empirical Bayesian framework to learn the links between SNP functional annotations and SNP weights. Because these links are learned with a neural network, PRSFNN can capture complex, non-linear functions of annotations with minimal assumptions. After curating extensive annotations, including ancestry-stratified allele frequencies, chromatin accessibility across hundreds of developmental and adult cell types, transcription factor binding from ENCODE4, quantitative trait loci, and sequence conservation from Zoonomia, we evaluated PRSFNN on 18 continuous complex traits in the UK Biobank. Benchmarked against other leading PRS methods on an out-of-sample test set, PRSFNN outperforms them on 17 of 18 traits. Finally, we show that a low-density lipoprotein PRS estimated with PRSFNN outperforms other PRS methods in predicting incident cardiovascular disease. Overall, PRSFNN uses a curated SNP annotation atlas within a neural empirical Bayesian framework to achieve state-of-the-art performance, advancing our ability to predict phenotypic variation from genetic variation.
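To make the general idea of an annotation-informed empirical Bayes prior concrete, the following is a minimal sketch and not the PRSFNN implementation. It rests on several simplifying assumptions that the abstract does not state: SNPs are treated as independent (no linkage disequilibrium), each effect has a zero-mean normal prior whose variance is the output of a tiny one-hidden-layer network, and the network weights are random rather than fitted. All function names, array shapes, and values are hypothetical.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)); keeps the prior variances positive.
    return np.logaddexp(0.0, x)

def prior_variance(annotations, W1, b1, w2, b2):
    # Hypothetical one-hidden-layer network:
    # (n_snps, n_annot) annotations -> one prior variance per SNP.
    hidden = np.tanh(annotations @ W1 + b1)
    return softplus(hidden @ w2 + b2)

def marginal_log_likelihood(beta_hat, se, prior_var):
    # Empirical Bayes objective under the assumed model:
    # beta_hat_j ~ N(0, prior_var_j + se_j^2) once the true effect is integrated out.
    total_var = prior_var + se**2
    return -0.5 * np.sum(np.log(2 * np.pi * total_var) + beta_hat**2 / total_var)

def posterior_mean(beta_hat, se, prior_var):
    # Normal-normal shrinkage: scale each marginal estimate by
    # prior_var / (prior_var + se^2), a factor between 0 and 1.
    return beta_hat * prior_var / (prior_var + se**2)

# Toy inputs (all values are made up for illustration).
rng = np.random.default_rng(0)
n_snps, n_annot, n_hidden = 5, 3, 4
annotations = rng.normal(size=(n_snps, n_annot))
W1 = rng.normal(size=(n_annot, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(size=n_hidden)
b2 = 0.0
beta_hat = rng.normal(scale=0.05, size=n_snps)  # marginal effect estimates
se = np.full(n_snps, 0.02)                      # their standard errors

prior_var = prior_variance(annotations, W1, b1, w2, b2)
snp_weights = posterior_mean(beta_hat, se, prior_var)
objective = marginal_log_likelihood(beta_hat, se, prior_var)
```

In an empirical Bayes fit of this kind, the network parameters would be chosen to maximize `marginal_log_likelihood` across SNPs (typically with an automatic differentiation library), and the resulting posterior means would serve as SNP weights. That optimization step is omitted here.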