Automatic Processing of Gastrointestinal Endoscopy Referrals and Patient Communication Using Large Language Models
Abstract
Background: Open-access endoscopy relies on manual vetting of referrals, a resource-consuming process that is prone to bias. We assessed whether large language models (LLMs) can provide accurate recommendations on gastrointestinal endoscopy referrals.

Methods: We extracted 200 multilingual endoscopy referrals and evaluated OpenAI's o3 and Google's Gemini 2.5-pro. A prompt was developed and tuned on a set of 20 referrals and tested on the remaining 180. Eight variables were assessed: procedure type, indication, need for an anesthesiologist, withdrawal of anti-aggregants, anticoagulants, and GLP-1 agonists, presence of implantable electronic devices, and need for intensified preparation. Accuracy and F1 scores were analyzed using bootstrapping, the models were compared with McNemar's test, and confusion matrices were calculated. Additionally, o3 generated patient-specific visual timelines.

Results: Of the 200 referrals, 88 (44%) were for colonoscopy and 54 (27%) for gastroscopy; 65 (32.5%) required an anesthesiologist and 65 (32.5%) intensified preparation. o3 achieved 0.91–1.00 accuracy across all eight variables, whereas Gemini ranged from 0.89 to 0.90. Confusion-matrix analysis confirmed high precision and specificity for both models (≥ 0.90 and ≥ 0.92, respectively). o3 generated accurate, patient-specific visual timelines for the sampled cases.

Conclusion: LLMs are highly accurate in processing endoscopy referrals and can generate patient-specific instructions, offering a solution to streamline open-access endoscopy.
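The abstract's statistical approach, percentile bootstrapping for accuracy confidence intervals and McNemar's test on paired model predictions, can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the synthetic 0/1 correctness vectors are assumptions for the example.

```python
import random
import math

def bootstrap_accuracy_ci(correct, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy over a list of 0/1 outcomes
    (1 = model output matched the reference label)."""
    rng = random.Random(seed)
    n = len(correct)
    stats = sorted(sum(rng.choices(correct, k=n)) / n for _ in range(n_boot))
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return sum(correct) / n, (lo, hi)

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test on the discordant pairs:
    b = referrals where model A was correct and model B wrong,
    c = the reverse. Concordant pairs do not enter the statistic."""
    n = b + c
    k = min(b, c)
    # Exact binomial tail under H0: discordant pairs split 50/50.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical correctness vector for one variable (e.g. procedure type).
correct_a = [1] * 171 + [0] * 9  # 0.95 accuracy on 180 test referrals
acc, (lo, hi) = bootstrap_accuracy_ci(correct_a)
p_value = mcnemar_exact(b=8, c=2)  # illustrative discordant counts
```

The discordant counts `b` and `c` come from the paired confusion of the two models on the same referrals, which is why McNemar's test, rather than an unpaired comparison, is the appropriate choice here.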