Evaluating the Efficacy of ChatGPT in Generating Assessment and Plans for Medical Notes in Urology
Abstract
Purpose: The increase in medical documentation responsibilities has been implicated in rising burnout rates among urologists. AI chatbots such as ChatGPT may help reduce the documentation workload of urologists. This study evaluates the quality of assessment and plans created by ChatGPT compared with those created by residents.

Methods: Eleven fictional cases were submitted to ChatGPT-4 and to four residents at the University of Illinois, with instructions to create an assessment and plan for each scenario. The responses were given to two attending physicians, who graded them for accuracy, clarity, and clinical reasoning using Likert-type scales. The graders noted whether false information was present and recorded the perceived identity of each response's author. The Mann-Whitney U test was used for statistical analysis of the Likert-type data.

Results: Compared with responses created by residents, ChatGPT received significantly higher scores for clarity, comprehensiveness, and soundness of clinical reasoning. There was no significant difference between the two groups in accuracy of diagnosis or accuracy of the treatment plan. The evaluators misidentified the author's identity in 13.64% of responses.

Conclusion: This study compared the abilities of ChatGPT and residents in generating assessment and plans for clinical encounters. The results indicate that ChatGPT is superior to residents in the tested domains.