Predicting Annotation Yield in Artificial Intelligence-Ranked Electronic Health Record Cohorts: A Regression-Based Framework for Efficient Manual Review
Abstract
Background: Painstaking manual chart review of electronic health records (EHRs) remains the chief bottleneck in retrospective studies, especially when rare-disease cohorts demand high specificity. Automated NLP rankers help, yet when trained on dated data they leave teams guessing how long to keep reviewing charts. We therefore present a regression-based 'screening-saturation' model that predicts residual yield at every point along the ranked list.

Methods: Leveraging a previously validated SVM that ranks notes for pediatric status epilepticus, we trained four predictive models (linear regression, polynomial regression, support-vector regression, and a lightweight neural network) on notes from 2013 and tested them on data from 2020. The prediction target was the proportion of true positives (ESE or RSE) expected below any score threshold.

Results: Polynomial regression offered the best balance of generalizability and interpretability, maintaining strong predictive performance even under temporal data shift. Regression outputs were used to simulate stopping rules for manual review; a simulated rule captured 80% of positives after reviewing just 16.6% of notes, an 83% workload reduction.

Conclusion: Our scalable, model-agnostic framework turns AI scores into actionable staffing decisions in clinical workflows. The screening-saturation model integrates with clinician-in-the-loop tools and adapts readily across medical domains that need lean chart review.
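As a minimal sketch of the screening-saturation idea described above, the snippet below fits a polynomial regression to the cumulative fraction of true positives captured as a function of review depth in a ranked list, then reads off the depth at which a stopping rule would be predicted to capture 80% of positives. The simulated data, the degree-3 polynomial, and all variable names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# --- Hypothetical stand-in for an AI-ranked cohort ---
# Notes sorted by descending ranker score; positives concentrate near the top,
# so the hit rate decays as review moves down the list.
n_notes = 5000
ranks = np.arange(n_notes)
p_positive = 0.6 * np.exp(-ranks / 400)        # decaying probability of a true positive
labels = rng.random(n_notes) < p_positive      # 1 = confirmed positive on manual review

# Saturation curve: cumulative fraction of all positives captured
# versus the fraction of the ranked list reviewed so far.
frac_reviewed = (ranks + 1) / n_notes
frac_captured = np.cumsum(labels) / labels.sum()

# Fit a polynomial regression (the best-performing model class in the abstract)
# mapping review depth to expected capture rate.
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(frac_reviewed.reshape(-1, 1), frac_captured)

# Simulated stopping rule: stop once the predicted capture rate reaches 80%.
grid = np.linspace(0.01, 1.0, 200).reshape(-1, 1)
pred = model.predict(grid)
stop_depth = grid[np.argmax(pred >= 0.80)][0]
print(f"Stop after reviewing ~{stop_depth:.1%} of ranked notes "
      f"(predicted to capture 80% of positives)")
```

In practice the curve would be fit on an annotated development set (e.g., the 2013 notes) and applied to the score distribution of a new cohort, which is what makes the residual-yield prediction sensitive to temporal data shift.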