Large Language Models as Ophthalmic Patient Educators: A Comparative Evaluation of Readability, Understandability, and Actionability
Abstract
Purpose: To compare the readability, understandability, and actionability of ophthalmic patient education responses generated by 3 publicly accessible artificial intelligence (AI) platforms: ChatGPT, Perplexity, and Gemini.

Methods: In this cross-sectional study, high-interest ophthalmology queries were identified using Google Trends to approximate patient information-seeking behavior. Each query was entered verbatim into ChatGPT, Perplexity, and Gemini using standardized single-prompt sessions. Responses were evaluated using Flesch Reading Ease and the Patient Education Materials Assessment Tool (PEMAT) for understandability and actionability. Paired two-sided Wilcoxon signed-rank tests with Holm correction were used for comparisons, with P < .05 considered statistically significant.

Results: Median Flesch Reading Ease differed between Perplexity and ChatGPT (33.6 vs 41.3; Holm-adjusted P = .003), while Gemini did not differ from ChatGPT (43.6 vs 41.3; Holm-adjusted P = .42). Median PEMAT Understandability was higher for Perplexity compared with ChatGPT (83.3% vs 66.7%; Holm-adjusted P < .001), with no difference between Gemini and ChatGPT (66.7% vs 66.7%; Holm-adjusted P = .41). Median PEMAT Actionability was lower for Perplexity compared with ChatGPT (33.3% vs 66.7%; Holm-adjusted P = .018), and no difference was observed between Gemini and ChatGPT (33.3% vs 66.7%; Holm-adjusted P = .54).

Conclusions: AI platforms produced ophthalmic patient education responses with significant variability in readability and PEMAT domains. Compared with ChatGPT, Perplexity demonstrated higher understandability but lower actionability, while Gemini showed no differences in these domains. These findings support the need for platform-specific evaluation and optimization of AI-generated patient education before clinical integration.
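The statistical comparison described in the Methods (paired two-sided Wilcoxon signed-rank tests with Holm correction) can be sketched as follows. This is an illustrative example only: the per-response scores below are invented placeholder values, not data from the study, and the study's actual analysis pipeline may differ.

```python
# Sketch of paired Wilcoxon signed-rank tests with Holm correction,
# as described in the Methods. All score values are hypothetical.
from scipy.stats import wilcoxon
from statsmodels.stats.multitest import multipletests

# Hypothetical Flesch Reading Ease scores for the same set of queries
# answered by each platform (paired by query).
chatgpt    = [41.3, 39.0, 45.2, 40.1, 38.7, 44.0, 42.5, 37.9]
perplexity = [33.6, 30.2, 36.8, 31.5, 29.9, 35.1, 34.0, 28.7]
gemini     = [43.6, 41.2, 47.0, 42.3, 40.8, 46.1, 44.9, 39.5]

# Paired two-sided Wilcoxon signed-rank tests, each platform vs ChatGPT.
p_perp = wilcoxon(perplexity, chatgpt, alternative="two-sided").pvalue
p_gem = wilcoxon(gemini, chatgpt, alternative="two-sided").pvalue

# Holm correction across the family of pairwise comparisons.
reject, p_adj, _, _ = multipletests([p_perp, p_gem], alpha=0.05, method="holm")

for name, p, r in zip(["Perplexity vs ChatGPT", "Gemini vs ChatGPT"],
                      p_adj, reject):
    print(f"{name}: Holm-adjusted P = {p:.4f}, significant = {r}")
```

With paired data, `wilcoxon` tests the differences within each query rather than pooling responses, and the Holm step controls the family-wise error rate across the two comparisons against ChatGPT.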