Design and Implementation of an End-to-End AI-Driven Colonoscopy Recall Workflow at Scale
Abstract
The rise of structured data elements in Electronic Health Records (EHRs) is a key enabler of improved care quality. However, the transition toward routine use of these fields paradoxically heightens patient safety risks, owing to increased variability in documentation and the use of placeholder values pending manual review. For large clinical initiatives such as colon cancer screening and surveillance, misinterpretation of recorded clinical data can be particularly problematic, disrupting risk-adapted recall guidance and potentially exacerbating care gaps. This case study details the development and deployment of a Large Language Model (LLM)-driven workflow to extract and transfer unstructured colonoscopy recall recommendations as part of a larger EHR migration. Using GPT-4 Turbo for the core inference step of a fully integrated pipeline that spans custom SQL queries, Optical Character Recognition (OCR) of historical PDFs, LLM-based inference, and anomaly detection, we structured and migrated population-wide colonoscopy recall data corresponding to over 100,000 patients and 10 years of clinical care. The pipeline demonstrated high accuracy (Macro F1=1.0 against clinician review), scalability, and cost efficiency. We estimate that use of this workflow, relative to the alternative of a default 10-year reminder from the last colonoscopy, may prevent over 6,000 new colorectal cancer cases (a projected cost savings of $400-670 million). Key lessons from implementation include the importance of stakeholder alignment, the necessity of robust quality control at scale, and the technical challenges of expanding optimized LLM inference to a full end-to-end clinical workflow.
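For readers unfamiliar with the reported accuracy metric, the following is a minimal sketch of how a macro-averaged F1 score can be computed when pipeline output is compared with clinician review. It is not the authors' evaluation code: the function name `macro_f1`, the label encoding (recall interval in years as strings), and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical recall-interval labels (years); invented data, not study data.
clinician = ["1", "3", "5", "10", "3", "5"]
pipeline = ["1", "3", "5", "10", "3", "5"]

perfect = macro_f1(clinician, pipeline)  # every label agrees
one_error = macro_f1(clinician, ["1", "3", "5", "10", "5", "5"])  # one 3-year case read as 5
```

Because macro averaging weights each class equally, a single misread recall interval in a rare class lowers the score noticeably, so a value of 1.0 requires agreement on every reviewed case in every class.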