Comparative Analysis of Nursing Care Plans Produced by Artificial Intelligence Models (ChatGPT, Gemini, DeepSeek) in Terms of Readability, Reliability, and Quality
Abstract
Background: While AI chatbots make healthcare information more accessible, research on the readability, reliability, and overall quality of the nursing care plans they generate remains limited.

Purpose: To investigate the readability, reliability, and overall quality of nursing care plan texts generated by AI-driven chatbots (ChatGPT, Gemini, and DeepSeek).

Methods: A total of 30 nursing diagnoses were randomly selected from the NANDA 2021–2023 taxonomy. For each diagnosis, care plans were generated by three different AI chatbots, yielding 90 texts in total. The generated plans were evaluated with a descriptive criteria form, the DISCERN tool for health-information quality, and multiple readability measures (FRES, SMOG, Gunning Fog Index, and Flesch-Kincaid Grade Level).

Results: The nursing care plans generated by ChatGPT, Gemini, and DeepSeek had readability scores significantly above the recommended sixth-grade reading level (P < .001). DISCERN analysis yielded mean scores of 57.41 ± 5.9 for ChatGPT, 58.41 ± 4.8 for Gemini, and 56.51 ± 6.8 for DeepSeek, reflecting moderate reliability overall. Among the generated texts, 27 (90%) provided information rated as moderate in quality. Moreover, the inclusion of verifiable references showed a statistically significant positive association with both reliability and quality scores (P < .05).

Conclusion: Artificial intelligence chatbots cannot replace complete nursing care plans. For AI-driven tools, it is advised to improve the clarity of the generated content, include reliable references, and have the material reviewed by professionals.
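For readers unfamiliar with the readability indices cited in Methods, the sketch below shows how FRES, Flesch-Kincaid Grade Level, SMOG, and Gunning Fog are conventionally computed from word, sentence, and syllable counts. This is an illustrative approximation only: the syllable counter is a crude vowel-group heuristic, and the study does not specify the tooling it actually used to score the 90 texts.

```python
# Minimal sketch of the four readability formulas named in Methods.
# Assumption: a simple vowel-group heuristic stands in for a real
# syllable counter; the study's actual scoring tool is not specified.
import re

def count_syllables(word: str) -> int:
    # Heuristic: count contiguous vowel groups; every word has >= 1 syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # "Complex" words approximated as 3+ syllables (Fog/SMOG convention).
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    W, S = len(words), max(1, len(sentences))
    return {
        # Flesch Reading Ease Score: higher = easier to read.
        "FRES": 206.835 - 1.015 * (W / S) - 84.6 * (syllables / W),
        # Flesch-Kincaid Grade Level: approximate US school grade.
        "FKGL": 0.39 * (W / S) + 11.8 * (syllables / W) - 15.59,
        # SMOG index (formally intended for samples of 30+ sentences).
        "SMOG": 1.0430 * (30 * polysyllables / S) ** 0.5 + 3.1291,
        # Gunning Fog Index: grade level from sentence length and hard words.
        "FOG": 0.4 * ((W / S) + 100 * polysyllables / W),
    }

print(readability("Assess the patient's airway. Encourage deep breathing exercises."))
```

A sixth-grade target, as referenced in Results, corresponds roughly to FRES above about 80 or grade-level indices at or below 6, which is why scores above that threshold indicate text too difficult for the average health consumer.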