Large Language Models as Ophthalmic Patient Educators: A Comparative Evaluation of Readability, Understandability, and Actionability
Abstract
Purpose: To compare the readability, understandability, and actionability of ophthalmic patient education responses generated by 3 publicly accessible artificial intelligence (AI) platforms: ChatGPT, Perplexity, and Gemini.

Methods: In this cross-sectional study, high-interest ophthalmology queries were identified using Google Trends to approximate patient information-seeking behavior. Each query was entered verbatim into ChatGPT, Perplexity, and Gemini using standardized single-prompt sessions. Responses were evaluated using Flesch Reading Ease and the Patient Education Materials Assessment Tool (PEMAT) for understandability and actionability. Paired two-sided Wilcoxon signed-rank tests with Holm correction were used for comparisons, with P < .05 considered statistically significant.

Results: Median Flesch Reading Ease differed between Perplexity and ChatGPT (33.6 vs 41.3; Holm-adjusted P = .003), while Gemini did not differ from ChatGPT (43.6 vs 41.3; Holm-adjusted P = .42). Median PEMAT Understandability was higher for Perplexity compared with ChatGPT (83.3% vs 66.7%; Holm-adjusted P < .001), with no difference between Gemini and ChatGPT (66.7% vs 66.7%; Holm-adjusted P = .41). Median PEMAT Actionability was lower for Perplexity compared with ChatGPT (33.3% vs 66.7%; Holm-adjusted P = .018), and no difference was observed between Gemini and ChatGPT (33.3% vs 66.7%; Holm-adjusted P = .54).

Conclusions: AI platforms produced ophthalmic patient education responses with significant variability in readability and PEMAT domains. Compared with ChatGPT, Perplexity demonstrated higher understandability but lower actionability, while Gemini showed no differences in these domains. These findings support the need for platform-specific evaluation and optimization of AI-generated patient education before clinical integration.
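The statistical comparison described in the Methods (paired two-sided Wilcoxon signed-rank tests with Holm correction) can be sketched as follows. This is an illustrative example only: the per-response scores below are invented placeholder values, not data from the study, and the study's actual analysis pipeline may differ.

```python
# Sketch of paired Wilcoxon signed-rank tests with Holm correction,
# as described in the Methods. All score values are hypothetical.
from scipy.stats import wilcoxon
from statsmodels.stats.multitest import multipletests

# Hypothetical Flesch Reading Ease scores for the same set of queries
# answered by each platform (paired by query).
chatgpt    = [41.3, 39.0, 45.2, 40.1, 38.7, 44.0, 42.5, 37.9]
perplexity = [33.6, 30.2, 36.8, 31.5, 29.9, 35.1, 34.0, 28.7]
gemini     = [43.6, 41.2, 47.0, 42.3, 40.8, 46.1, 44.9, 39.5]

# Paired two-sided Wilcoxon signed-rank tests, each platform vs ChatGPT.
p_perp = wilcoxon(perplexity, chatgpt, alternative="two-sided").pvalue
p_gem = wilcoxon(gemini, chatgpt, alternative="two-sided").pvalue

# Holm correction across the family of pairwise comparisons.
reject, p_adj, _, _ = multipletests([p_perp, p_gem], alpha=0.05, method="holm")

for name, p, r in zip(["Perplexity vs ChatGPT", "Gemini vs ChatGPT"],
                      p_adj, reject):
    print(f"{name}: Holm-adjusted P = {p:.4f}, significant = {r}")
```

With paired data, `wilcoxon` tests the differences within each query rather than pooling responses, and the Holm step controls the family-wise error rate across the two comparisons against ChatGPT.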