Personalized polygenic risk prediction and assessment with a Mixture-of-Experts framework
Abstract
With the increasing availability of high-quality genomic data from diverse cohorts, polygenic scores (PRS) have become a mainstay of genetic analyses of complex traits and diseases. Despite their proliferation in numerous research domains, a major obstacle to wider adoption in clinical settings has been the well-established heterogeneity in prediction accuracy across demographic variables such as age, sex, and genetic ancestry. To address this deficiency, recent research efforts have aimed to improve representation in genetic studies and to develop stratified PRS inference methods, which have greatly enhanced accuracy in minority populations. However, with these stratified scores in hand, it remains unclear how to assign the best score, or mixture of scores, to a particular test individual in the clinic. To bridge this gap, we present MoEPRS, an ensemble learning method based on the Mixture-of-Experts framework that blends the stratified scores using personalized mixing weights to predict the target phenotype. In biobank-scale analyses of 7 complex traits in the UK Biobank and CARTaGENE, we show that MoEPRS generally provides modest improvements in prediction accuracy over single-source PRS models and that its predictive performance is maintained across biobanks. Furthermore, we demonstrate practical use cases in which the model automatically identifies and adapts to diverse sources of heterogeneity in the data, which allows the strengths and weaknesses of current polygenic scores to be evaluated across cohort strata.
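To make the idea of blending stratified scores with personalized mixing weights concrete, the following is a minimal sketch of a generic Mixture-of-Experts prediction step. It is not the MoEPRS implementation: the abstract does not specify the gating model, so the softmax-linear gate, the function names (`softmax`, `mixing_weights`, `moe_predict`), the covariate layout, and all numeric values are assumptions chosen purely for illustration.

```python
import math


def softmax(logits):
    """Convert gate logits into non-negative weights that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixing_weights(covariates, gate_weights, gate_bias):
    """Hypothetical linear gate: one coefficient vector and bias per expert."""
    logits = [
        sum(c * w for c, w in zip(covariates, coefs)) + b
        for coefs, b in zip(gate_weights, gate_bias)
    ]
    return softmax(logits)


def moe_predict(covariates, expert_scores, gate_weights, gate_bias):
    """Blend one individual's stratified PRS values with their own weights."""
    weights = mixing_weights(covariates, gate_weights, gate_bias)
    prediction = sum(w * s for w, s in zip(weights, expert_scores))
    return prediction, weights


# Illustrative (made-up) inputs for a single individual.
covariates = [0.4, 1.0, -0.2]        # e.g. standardized age, sex, one ancestry PC
expert_scores = [0.8, 0.3]           # two stratified PRS values for this person
gate_weights = [[0.5, -0.1, 1.2],    # gate coefficients for expert 0
                [-0.5, 0.1, -1.2]]   # gate coefficients for expert 1
gate_bias = [0.0, 0.0]

prediction, weights = moe_predict(covariates, expert_scores, gate_weights, gate_bias)
```

Because the weights are a convex combination, the blended prediction for each individual always lies between the smallest and largest of that individual's expert scores. In a full model the gate parameters would be learned from data rather than fixed by hand as they are here.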