A Framework for Locally Imputing and Predicting Biomarker Trajectories Under Irregular Monitoring: Application to Chronic Myeloid Leukemia
Abstract
Irregular monitoring and missing data limit the utility of longitudinal biomarkers in real-world practice. We developed a generalizable framework that combines interval-aligned preprocessing, localized multiple imputation, and machine-learning forecasting to generate complete trajectories and predict future biomarker values under routine clinical conditions. Using BCR::ABL1 monitoring in chronic myeloid leukemia as a case study, we aligned measurements to 90-day intervals, applied a windowed, uncertainty-propagating imputation strategy, and trained recurrent neural network (RNN) and XGBoost models to forecast values three and six months ahead. Full Information models achieved RMSEs of 1.22–1.24 for 3-month predictions—well below the biomarker’s observed variability—and maintained accuracy even when the most recent visit was intentionally omitted, simulating extended follow-up. This framework preserves local temporal structure, supports individualized monitoring decisions, and is directly adaptable to other continuous biomarkers measured under irregular real-world schedules.
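The interval-alignment step described in the abstract can be illustrated with a minimal sketch. The abstract states only that measurements were aligned to 90-day intervals; everything else here is an assumption made for illustration, including the function name `align_to_intervals`, the use of days since baseline as the time axis, averaging multiple values that fall in the same interval, and the example values. Empty intervals are left as `None` to mark the gaps that an imputation step would later fill.

```python
from statistics import mean
from typing import List, Optional, Tuple


def align_to_intervals(
    measurements: List[Tuple[int, float]],
    n_intervals: int,
    interval_days: int = 90,
) -> List[Optional[float]]:
    """Place (day_since_baseline, value) pairs on a fixed interval grid.

    Interval k covers days [k * interval_days, (k + 1) * interval_days).
    Averaging several values within one interval is an assumption of this
    sketch, not a rule taken from the source. Intervals without any
    measurement are returned as None so a later step can impute them.
    """
    bins: List[List[float]] = [[] for _ in range(n_intervals)]
    for day, value in measurements:
        k = day // interval_days
        # Ignore measurements before baseline or beyond the grid.
        if 0 <= k < n_intervals:
            bins[k].append(value)
    return [mean(b) if b else None for b in bins]


# Hypothetical patient: irregular visit days and illustrative biomarker values.
visits = [(5, 2.0), (70, 1.0), (200, 0.5), (400, 0.1)]
aligned = align_to_intervals(visits, n_intervals=5)
print(aligned)
```

In this example the two visits in the first 90 days are merged into one value, and the second and fourth intervals come back as `None`, which makes the missing-data pattern explicit before any imputation or forecasting is applied.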