Assessing Large Language Model Utility and Limitations in Diabetes Education: A Cross-Sectional Study of Patient Interactions and Specialist Evaluations

Abstract

Objectives

To assess the value of an AI-powered conversational agent in supporting diabetes self-management among adults with diabetic retinopathy and limited educational backgrounds.

Methods

In this cross-sectional study, 51 adults with type 2 diabetes and diabetic retinopathy participated in moderated Q-and-A sessions with ChatGPT. Non-English-speaking and visually impaired participants interacted through trained human support. Each question–response pair was assigned to one of seven thematic categories and independently evaluated by endocrinologists and ophthalmologists using the 3C + 2 framework (clarity, completeness, correctness, safety, recency). Inter-rater reliability was calculated with intraclass correlation coefficients (ICC) and Fleiss' kappa.
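As an illustration of the reliability analysis described above, the sketch below shows how ICC and Fleiss' kappa could be computed from a table of specialist ratings in Python, using the pingouin and statsmodels libraries. The column names, the toy ratings, and the choice of libraries are assumptions for illustration, not the authors' actual analysis pipeline.

```python
# Minimal sketch of the inter-rater reliability analysis (illustrative only).
import pandas as pd
import pingouin as pg
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Toy data: three hypothetical raters each score the same
# question-response pairs on a 1-5 scale.
ratings = pd.DataFrame({
    "pair":  [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "rater": ["A", "B", "C"] * 4,
    "score": [5, 4, 5, 3, 3, 4, 4, 4, 5, 2, 3, 3],
})

# Intraclass correlation: pingouin reports several ICC variants,
# so the row matching the study's rating design would be selected.
icc = pg.intraclass_corr(data=ratings, targets="pair",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])

# Fleiss' kappa expects a subjects x categories count matrix;
# aggregate_raters builds one from a subjects x raters array.
wide = ratings.pivot(index="pair", columns="rater", values="score")
table, _ = aggregate_raters(wide.to_numpy())
print("Fleiss' kappa:", fleiss_kappa(table, method="fleiss"))
```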

Results

The cohort generated 137 questions, and 98% of the conversational agent's answers were judged informative and empathetic. Endocrinologists awarded high mean scores for clarity (4.66/5) and completeness (4.52/5) but showed limited agreement (ICC = 0.13 and 0.27, respectively). Ophthalmologists gave lower mean scores for clarity (3.09/5) and completeness (2.94/5) yet demonstrated stronger agreement (ICC = 0.70 and 0.52, respectively). Reviewers detected occasional inaccuracies and hallucinations. Participants valued the agent for sensitive discussions but deferred to physicians for complex medical issues.

Conclusions

An AI conversational agent can help bridge communication gaps in diabetes care by providing accurate, easy-to-understand answers for individuals facing language, literacy, or vision-related barriers. Nonetheless, hallucinations and variable specialist ratings underscore the need for continuous physician oversight and iterative refinement of AI outputs.

Practice implications

Introducing conversational AI into resource-limited clinics could enhance patient education and engagement, provided that clinicians review and contextualise the advice to ensure safety, accuracy, and personalisation. Future development should prioritise reducing hallucinations and bolstering domain-specific reliability so the tool complements, rather than replaces, professional care.