Calibrated Prediction Intervals for Polygenic Scores: Updated Comparisons, Contextual Calibration, and Data Normalization

Chang Xu
Siyu Hou
Xiang Zhou


Abstract

Calibrated prediction intervals for polygenic scores (PGS) are essential for communicating individual-level uncertainty in genomic medicine. We present updated comparisons of two methods for constructing such intervals: CalPred, a parametric approach, and PredInterval, a non-parametric approach. Our results show that both methods can achieve calibrated coverage, although CalPred additionally requires a sufficiently large calibration set. The two methods also exhibit complementary trade-offs with respect to dataset size and risk identification. We further show that contextual calibration, as introduced in Hou et al. and followed in Shi et al., is most naturally achieved through appropriate phenotype normalization and data preprocessing. Apparent miscalibration can arise from inadequate normalization or from providing contextual information to some methods but not others. In UK Biobank, standard GWAS phenotype normalization procedures are sufficient to achieve contextual calibration for the traits analyzed. In the extreme simulations of Hou et al. and Shi et al., supplying contextual covariates to PredInterval restores contextual calibration without normalization, and appropriate normalization can achieve contextual calibration without supplying covariates, while also substantially improving upstream tasks including association power and PGS accuracy. Together, these results underscore the central role of phenotype normalization and data preprocessing in GWAS analyses, including reliable uncertainty quantification for PGS.
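
To make the distinction between overall and contextual calibration concrete, the following toy sketch may help. It is not the CalPred or PredInterval implementation: it swaps in a generic split-conformal-style construction (intervals of the form PGS plus or minus a quantile of absolute calibration residuals), and the simulated data, the binary context, the noise levels, and all function names are assumptions made purely for illustration.

```python
import math
import random


def simulate(n, seed):
    """Toy data: phenotype = PGS + noise whose spread depends on a binary context."""
    rng = random.Random(seed)
    noise_sd = {0: 1.0, 1: 3.0}  # hypothetical context-specific residual SDs
    rows = []
    for i in range(n):
        context = i % 2  # alternate contexts so both groups have equal size
        pgs = rng.gauss(0.0, 1.0)
        y = pgs + rng.gauss(0.0, noise_sd[context])
        rows.append((context, pgs, y))
    return rows


def conformal_halfwidth(abs_residuals, alpha):
    """Finite-sample-adjusted (1 - alpha) quantile of absolute calibration residuals."""
    scores = sorted(abs_residuals)
    n = len(scores)
    k = min(n - 1, math.ceil((n + 1) * (1 - alpha)) - 1)
    return scores[k]


def coverage(rows, halfwidth_for):
    """Fraction of rows whose phenotype lies within pgs +/- halfwidth_for(context)."""
    hits = sum(abs(y - pgs) <= halfwidth_for(c) for c, pgs, y in rows)
    return hits / len(rows)


def run(alpha=0.1, n=20000):
    """Compare a pooled half-width with context-specific half-widths on held-out data."""
    calib, test = simulate(n, seed=1), simulate(n, seed=2)
    pooled = conformal_halfwidth([abs(y - p) for _, p, y in calib], alpha)
    by_context = {
        c: conformal_halfwidth([abs(y - p) for cc, p, y in calib if cc == c], alpha)
        for c in (0, 1)
    }
    rules = (("pooled", lambda c: pooled), ("contextual", lambda c: by_context[c]))
    results = {}
    for name, rule in rules:
        results[name] = {
            "overall": coverage(test, rule),
            0: coverage([r for r in test if r[0] == 0], rule),
            1: coverage([r for r in test if r[0] == 1], rule),
        }
    return results


if __name__ == "__main__":
    for name, cov in run().items():
        print(name, {k: round(v, 3) for k, v in cov.items()})
```

In this toy setting the pooled rule should land near the nominal 90% coverage overall while over-covering the low-noise context and under-covering the high-noise one. The context-specific rule should land near 90% in both groups. This mirrors, in a very simplified form, the abstract's point that apparent miscalibration can depend on whether contextual information is supplied.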

Version published to 10.64898/2026.05.15.26353336 on medRxiv
May 19, 2026

