Assessing ChatGPT’s Performance in Delineating Uveitis: An analysis of responses to real-world case presentations
Abstract
Background
In the field of Artificial Intelligence (AI), Generative Pretrained Transformer-3 (GPT-3) has gained significant popularity for its demonstrated potential in medical education and diagnostics.
Rationale
While AI has shown promising results in healthcare thus far, its understanding of ocular urgencies, particularly uveitis, demands a focused investigation.
Methods
This study explored the application of ChatGPT, a language model derived from GPT-3, in delineating uveitis based on patient presentations and investigations. We analyzed the quality of ChatGPT's communication across 14 qualitative metrics by supplying patient data as prompts at four incremental levels: patient history, drug history, examination findings, and clinical investigations.
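For illustration only, a staged-prompt protocol of this kind could be scripted as in the minimal sketch below. This is not the authors' actual procedure (the study queried ChatGPT interactively), and `query_chatgpt` is a hypothetical stand-in for whatever model interface is used; responses would still be rated manually against the qualitative metrics.

```python
# Minimal sketch of a staged-prompt evaluation. `query_chatgpt` is a
# hypothetical callable (prompt -> response text); the real study used
# the ChatGPT interface directly rather than code.

PROMPT_LEVELS = [
    "patient history",
    "drug history",
    "examination findings",
    "clinical investigations",
]

# Each response is later rated on 14 qualitative metrics using
# categories like "comprehensive", "correct but inadequate",
# "mixed accurate and outdated data", and "completely inaccurate".

def build_prompt(case: dict, level: int) -> str:
    """Concatenate the case data available up to and including `level`."""
    parts = [case[key] for key in PROMPT_LEVELS[: level + 1]]
    return "Patient presentation:\n" + "\n".join(parts)

def collect_responses(case: dict, query_chatgpt) -> list[str]:
    """Return one model response per prompt level for manual rating."""
    return [
        query_chatgpt(build_prompt(case, lvl))
        for lvl in range(len(PROMPT_LEVELS))
    ]
```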
Results
Our results showed that, at the initial prompt, ChatGPT's responses were comprehensive for most (8 of 14) variables and correct but inadequate for some (3 of 14) variables in the majority (>50.0%) of uveitis cases. Ethical considerations was the only variable for which responses consistently showed a mix of accurate and outdated data across all prompts in most (95.8%) uveitis cases. Notably, none of ChatGPT's responses was completely inaccurate for any variable, at any prompt, in any uveitis case.
Conclusion
The results reveal ChatGPT's strengths and limitations in answering queries for patients with uveitis or its differential diagnoses, while emphasizing the indispensable role of physicians in ethical decision-making.