Chatbots' Performance in Premature Ejaculation Questions: A Comparative Analysis of Reliability, Readability, and Understandability


Abstract

Objective: This study aimed to evaluate the reliability, readability, and understandability of chatbot responses to frequently asked questions about premature ejaculation, and to assess the contributions, potential risks, and limitations of artificial intelligence.

Methods: Fifteen questions were selected using data from Google Trends and posed to the chatbots Copilot, Gemini, ChatGPT 4o, ChatGPT 4o Plus, and DeepSeek-R1. Two experts evaluated reliability using the Global Quality Scale (GQS). Readability was assessed with the Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), Gunning Fog Index (GFI), and Simple Measure of Gobbledygook (SMOG). Understandability was evaluated using the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P). Additionally, the consistency of source citations was examined.

Results: The GQS scores (mean ± SD) were: Copilot 3.96 ± 0.66, Gemini 3.66 ± 0.78, ChatGPT 4o 4.83 ± 0.23, ChatGPT 4o Plus 4.83 ± 0.29, and DeepSeek 4.86 ± 0.22. The PEMAT-P scores were: Copilot 0.70 ± 0.05, Gemini 0.72 ± 0.04, ChatGPT 4o 0.83 ± 0.03, ChatGPT 4o Plus 0.77 ± 0.06, and DeepSeek 0.79 ± 0.06. While ChatGPT and DeepSeek scored higher for reliability and understandability, all chatbots performed at an acceptable level. However, readability scores were above the recommended level for the target audience. Instances of low-reliability or unverified sources were noted, with no significant differences between the chatbots.

Conclusion: Chatbots provide highly reliable and informative responses regarding premature ejaculation; however, significant limitations remain, particularly in readability and in the reliability of cited sources.
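For readers unfamiliar with the four readability measures named in the Methods, the sketch below gives their standard published formulas in Python. It is only an illustration: the abstract does not say which tool or counting rules the authors used, so the function names and the choice to pass precomputed counts (words, sentences, syllables, complex or polysyllabic words) are assumptions made here, not the study's method.

```python
import math

# Illustrative helpers (hypothetical names). Each takes precomputed counts,
# because how words, sentences, and syllables are counted varies by tool
# and is not specified in the abstract.

def flesch_reading_ease(words, sentences, syllables):
    """FRES: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """FKGL: approximate US school grade level."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def gunning_fog(words, sentences, complex_words):
    """GFI: complex_words are words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def smog(sentences, polysyllables):
    """SMOG: polysyllable count is normalized to a 30-sentence sample."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291

if __name__ == "__main__":
    # Made-up counts for a short passage, not data from the study.
    words, sentences, syllables, complex_words = 100, 5, 150, 10
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
    print(round(gunning_fog(words, sentences, complex_words), 2))
    print(round(smog(sentences, complex_words), 2))
```

All four measures penalize long sentences and long words. FRES is the only one where a higher value means easier text; the other three approximate the grade level needed to follow the passage, so a higher value means harder text.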
