ESI Triage Level Assignment for Headache Patients: Comparative Analysis of ChatGPT and Gemini Performance for Supporting Care Provider Decisions and Self-triage
Abstract
Objective: This study evaluated the performance of two large language models (LLMs), ChatGPT and Gemini, in supporting triage decisions for headache patients in emergency settings using the Emergency Severity Index (ESI), from both the patient self-triage and the healthcare provider perspectives.

Methods: Data comprising 500 records of patients presenting with headache complaints were obtained from the MIMIC-IV-ED database. Two distinct prompt types were created: one for self-triage, to help patients assess their care needs on the basis of symptom descriptions, and another for healthcare providers, to determine ESI levels. Each model's output was compared with the actual ESI levels using precision, recall, and F1 scores.

Results: ChatGPT achieved greater accuracy at lower acuity levels (ESI 3 and 4), accurately identifying patients who did not require urgent care. Gemini performed better at higher acuity levels (ESI 1 and 2), indicating that it recognizes critical cases effectively. Both models performed better with healthcare provider prompts than with self-triage prompts, underscoring the importance of structured input for accurate triage assessments. This gap highlights the need to refine self-triage prompts to ensure safe and precise use.

Conclusion: ChatGPT and Gemini show promise as decision-support tools for emergency department (ED) triage, particularly for helping healthcare providers prioritize cases on the basis of acuity. However, further refinement is needed to increase accuracy in self-triage scenarios. Future studies should validate these findings on a broader dataset and explore the integration of LLMs into clinical decision support systems to strengthen triage reliability and effectiveness.
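
The per-level comparison described in the Methods can be illustrated with a minimal sketch. The abstract does not describe the study's evaluation code, so the function name `per_level_metrics` and the toy labels below are hypothetical and used only for illustration. The sketch assumes each record carries one actual ESI level and one model-assigned level, both integers from 1 to 5, and computes precision, recall, and F1 separately for each level.

```python
from collections import Counter


def per_level_metrics(actual, predicted, levels=(1, 2, 3, 4, 5)):
    """Return {level: (precision, recall, f1)} for each ESI level.

    Hypothetical helper, not the study's code. A level that is never
    predicted (or never occurs) gets 0.0 instead of a division error.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for a, p in zip(actual, predicted):
        if a == p:
            tp[a] += 1   # model agreed with the recorded level
        else:
            fp[p] += 1   # model assigned level p incorrectly
            fn[a] += 1   # model missed the recorded level a

    metrics = {}
    for lv in levels:
        predicted_count = tp[lv] + fp[lv]
        actual_count = tp[lv] + fn[lv]
        precision = tp[lv] / predicted_count if predicted_count else 0.0
        recall = tp[lv] / actual_count if actual_count else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[lv] = (precision, recall, f1)
    return metrics


# Toy labels invented for illustration only (not study data).
actual = [2, 2, 3, 3, 3, 4]
predicted = [2, 3, 3, 3, 4, 4]
metrics = per_level_metrics(actual, predicted)

for level, (precision, recall, f1) in metrics.items():
    print(f"ESI {level}: precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Reporting the three scores per level, rather than a single overall accuracy, is what makes it possible to see a pattern such as one model being stronger at ESI 1 and 2 and another at ESI 3 and 4.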