Prompt Engineering in Large Language Models for Patient Education: A Systematic Review

Aya Mudrik
Girish N Nadkarni
Orly Efros
Shelly Soffer
Eyal Klang

Abstract

Background

Large language models (LLMs) have shown promise in generating patient-friendly medical content, but their outputs often vary in accuracy, readability, and relevance. Prompt engineering—structuring inputs to guide LLM responses—may improve the quality of educational materials, yet its impact on patient education remains unclear.

Objectives

To systematically review whether prompt engineering improves readability, accuracy, and usability of LLM-generated content for patient education.

Methods

We conducted a systematic review in accordance with PRISMA guidelines. PubMed, Scopus, and Web of Science were searched for original studies evaluating prompt engineering techniques in patient education. Data were extracted on prompt types, the LLMs used, and outcomes. Risk of bias was assessed using the QUADAS-2 tool, and a narrative synthesis was performed.

Results

Our search identified five studies that met our criteria, focusing on answering patient questions and generating medical information. Prompt engineering techniques included instruction-based, elaborated, role-defining, scene-defining, and domain-specific prompts. Structured prompting improved accuracy and comprehensiveness in several cases, particularly when specific formats or custom instructions were used. Readability gains were notable when prompts explicitly requested simpler language and reading levels, though some strategies unintentionally increased complexity. Variability in effectiveness across LLMs and prompt types was observed.
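
To make the prompt categories above concrete, here is a minimal Python sketch of how a role-defining, scene-defining, and reading-level instruction might be layered onto a plain patient question. The `PromptSpec` class, the `build_prompt` function, and all prompt wording are hypothetical illustrations; they are not taken from the reviewed studies, and no particular LLM API is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt components named in the review."""
    question: str
    role: Optional[str] = None            # role-defining component
    scene: Optional[str] = None           # scene-defining component
    reading_level: Optional[str] = None   # explicit readability instruction
    output_format: Optional[str] = None   # specific format request


def build_prompt(spec: PromptSpec) -> str:
    """Assemble a prompt string, including only the components that are set."""
    parts = []
    if spec.role:
        parts.append(f"You are {spec.role}.")
    if spec.scene:
        parts.append(f"Context: {spec.scene}")
    parts.append(f"Patient question: {spec.question}")
    if spec.reading_level:
        parts.append(
            f"Write the answer at a {spec.reading_level} reading level, "
            "using plain language."
        )
    if spec.output_format:
        parts.append(f"Format: {spec.output_format}")
    return "\n".join(parts)


# Unstructured baseline: the question alone.
baseline = build_prompt(PromptSpec(question="What is high blood pressure?"))

# Structured variant: role, scene, reading level, and format (all assumed wording).
structured = build_prompt(
    PromptSpec(
        question="What is high blood pressure?",
        role="a primary care nurse speaking with a patient",
        scene="The patient was just diagnosed and has no medical background.",
        reading_level="sixth-grade",
        output_format="three short bullet points",
    )
)

print(baseline)
print("---")
print(structured)
```

Comparing the responses a model gives to `baseline` and `structured` on readability and accuracy measures is one way to evaluate a prompt strategy; which components help is likely to depend on the model, as the results above suggest.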

Conclusion

Prompt engineering can enhance the clarity and, in some cases, the accuracy of LLM-generated patient education materials. However, benefits vary by model and strategy. Standardized approaches and further research are needed to optimize prompts, minimize bias, and support reliable, accessible patient communication.

Version published to 10.1101/2025.03.28.25324834v1 on medRxiv
Mar 28, 2025

Related articles

From Lectures to Learning Outcomes: Meaningful Integration of AI-Generated Content in Pre-Clerkship Medical Training

This article has 11 authors:
1. Jay Khurana
2. Hossam A Zaki
3. Ellie Pavlick
4. Jillian Turbitt
5. Heather McGee
6. Sahil Gupta
7. Sriya Sai Pushpa Datla
8. Salma Eldeeb
9. Thais Salazar Mather
10. Sarita Warrier
11. Joyce Ou
This article has no evaluations
Latest version May 13, 2025

Investigating Expectations and Needs of Medical Professionals Regarding the Use of Large Language Models: A Study at German University Clinics

This article has 3 authors:
1. Juraj Vladika
2. Alexander Fichtl
3. Florian Matthes
This article has no evaluations
Latest version Apr 17, 2025

Patterns, Advances, and Gaps in Using ChatGPT and Similar Technologies in Nursing Education: A PAGER Scoping Review

This article has 8 authors:
1. Isaac Amankwaa, MS, PhD
2. Emmanuel Ekpor
3. Daniel Cudjoe
4. Emmanuel Kobiah
5. Abdul-Karim Jebuni Fuseini
6. Maximous Diebieri
7. Sabastin Gyamfi
8. Sharon Brownie
This article has no evaluations
Latest version Apr 22, 2025
