Calibrated Variant Effect Prediction at the Residue Level Using Conditional Score Distributions
Abstract
Effective clinical use of variant effect prediction (VEP) requires models that are both accurate and well calibrated. Calibration refers to a model’s ability to produce meaningful and reliable probability estimates. Here, we propose a practical path toward robust VEP calibration by calibrating at the residue level rather than using global or per-protein schemes. We identify variant subgroups that benefit from targeted calibration and show that, while VEPs appear well calibrated on average, they remain markedly miscalibrated within these subgroups. Leveraging these insights, we develop RaCoon (Residue-aware Calibration via Conditional distributions), implemented on ESM1b, which provides multicalibrated and interpretable predictions across diverse variant subgroups and significantly improves performance across multiple benchmarks. Targeted residue-level calibration not only improves overall calibration but, for most models, also yields gains in global AUROC. Specifically, RaCoon increases AUROC from 0.912 to 0.924. Our calibration strategy, guided by model-specific feature distributions, is readily transferable to other VEPs.
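The gap between average and subgroup calibration described in the abstract can be illustrated with a generic diagnostic. The sketch below is not RaCoon and is not taken from the paper: it computes a standard binned expected calibration error (ECE) overall and per subgroup. The function names, the toy data, and the subgroup labels "buried" and "exposed" are all hypothetical, chosen only for illustration.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: weighted mean |mean predicted prob - observed rate| per bin."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each prediction to a bin index; clip so p == 1.0 lands in the last bin.
    idx = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece


def subgroup_ece(probs, labels, groups, n_bins=10):
    """ECE computed separately within each subgroup (hypothetical helper)."""
    probs, labels, groups = map(np.asarray, (probs, labels, groups))
    return {
        str(g): expected_calibration_error(probs[groups == g], labels[groups == g], n_bins)
        for g in np.unique(groups)
    }


# Toy data (invented): every variant is scored 0.5, but the true pathogenic
# rate is 0.2 in one subgroup and 0.8 in the other.
toy_probs = np.full(20, 0.5)
toy_labels = np.array([1, 1] + [0] * 8 + [1] * 8 + [0, 0])
toy_groups = np.array(["buried"] * 10 + ["exposed"] * 10)

overall = expected_calibration_error(toy_probs, toy_labels)
per_group = subgroup_ece(toy_probs, toy_labels, toy_groups)
print(overall, per_group)
```

In this toy setting the overall ECE is zero because the two subgroup errors cancel, while each subgroup on its own is off by about 0.3. This is the kind of hidden miscalibration that motivates checking calibration within subgroups rather than only on average.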