PTL-PRS: an R package for transfer learning of polygenic risk scores with pseudovalidation
Abstract
Summary: Polygenic risk scores (PRSs) are essential tools for predicting individual phenotypic risk but often lack accuracy in non-European ancestry groups. Transfer Learning for Polygenic Risk Scores (TL-PRS) addresses this challenge by leveraging European PRSs to improve prediction in underrepresented ancestries, but it requires privacy-sensitive individual-level data and has low computational efficiency. Therefore, we introduce PTL-PRS (Pseudovalidated Transfer Learning for PRS), an extension of TL-PRS that incorporates pseudovalidation to eliminate the need for individual-level data and includes further software optimization. For pseudovalidation, PTL-PRS generates pseudo-summary statistics for training and validation and evaluates model performance with the pseudo-R² metric. To improve computational efficiency, the PTL-PRS software was optimized with C++, blockwise early stopping, and direct genotype retrieval. Overall, PTL-PRS enhances both prediction accuracy and software usability, helping underrepresented populations achieve more accurate genetic risk predictions.

Availability and Implementation: The PTL.PRS R package is publicly available on GitHub at https://github.com/bokeumcho/PTL.PRS. The summary statistics used in this paper are available in the public domain: UK Biobank (https://pheweb.org/UKB-TOPMed), PGS Catalog (https://www.pgscatalog.org) and GenOMICC (https://genomicc.org/data).

Contact: bokeum1810@snu.ac.kr
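
The pseudovalidation idea named in the Summary can be illustrated with a minimal sketch. The abstract does not give the formulas PTL-PRS uses, so the code below assumes the generic summary-statistic splitting scheme from the pseudovalidation literature: full-sample marginal SNP-phenotype correlations are perturbed with noise whose covariance is the LD matrix to obtain pseudo-training statistics, and the remainder forms the pseudo-validation statistics. It is written in Python with NumPy rather than R, and the function names `split_summary_stats` and `pseudo_r` are hypothetical; they are not part of the PTL.PRS package API. Genotypes and phenotype are assumed to be standardized.

```python
import numpy as np


def split_summary_stats(r, ld, n, n_train, rng):
    """Split full-sample correlations into pseudo training/validation statistics.

    r       : (p,) marginal SNP-phenotype correlations from a GWAS of size n
    ld      : (p, p) LD (SNP-SNP correlation) matrix from a reference panel
    n_train : number of samples notionally assigned to the training split
    """
    n_val = n - n_train
    # Noise with covariance equal to the LD matrix (assumed sampling model).
    e = rng.multivariate_normal(np.zeros(len(r)), ld)
    r_train = r + np.sqrt(n_val / (n * n_train)) * e
    # The validation statistics are the remainder, so the sample-size-weighted
    # average of the two splits reproduces the full-sample statistics.
    r_val = (n * r - n_train * r_train) / n_val
    return r_train, r_val


def pseudo_r(beta, r_val, ld):
    """Pseudo-correlation between a PRS with weights beta and the phenotype."""
    return float(beta @ r_val / np.sqrt(beta @ ld @ beta))


# Toy example with two SNPs (all values are illustrative, not real data).
rng = np.random.default_rng(0)
ld = np.array([[1.0, 0.3], [0.3, 1.0]])
r = np.array([0.05, 0.02])
r_train, r_val = split_summary_stats(r, ld, n=10_000, n_train=8_000, rng=rng)

# Score a candidate PRS (here simply the pseudo-training correlations used
# as weights) on the pseudo-validation statistics.
score = pseudo_r(r_train, r_val, ld)
```

Squaring the value returned by `pseudo_r` gives a pseudo-R² style quantity. In a transfer-learning setting, a score of this kind could be compared across candidate models to select tuning parameters without access to individual-level validation data.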