Assessing AI-Generated Autism Information for Healthcare Use: A Cross-Linguistic and Cross-Geographic Evaluation of ChatGPT, Gemini, and Copilot
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (PREreview)
Abstract
Background/Objectives: Autism is one of the most prevalent neurodevelopmental conditions globally, and healthcare professionals, including pediatricians, developmental specialists, and speech–language pathologists, play a central role in guiding families through diagnosis, treatment, and support. As caregivers increasingly turn to digital platforms for autism-related information, artificial intelligence (AI) tools such as ChatGPT, Gemini, and Microsoft Copilot are emerging as popular sources of guidance. However, little is known about the quality, readability, and reliability of the information these tools provide. This study conducted a detailed comparative analysis of three widely used AI models within defined linguistic and geographic contexts to examine the quality of the autism-related information they generate. Methods: Responses to 44 caregiver-focused questions spanning two key domains—foundational knowledge and practical supports—were evaluated across three countries (USA, England, and Türkiye) and two languages (English and Turkish). Responses were coded for accuracy, readability, actionability, language framing, and reference quality. Results: ChatGPT generated the most accurate content but lacked reference transparency; Gemini produced the most actionable and well-referenced responses, particularly in Turkish; and Copilot used more accessible language but demonstrated lower overall accuracy. Across tools, responses often used medicalized language and exceeded recommended readability levels for health communication. Conclusions: These findings have critical implications for healthcare providers, who are increasingly tasked with helping families evaluate and navigate AI-generated information. This study offers practical recommendations for how providers can leverage the strengths and mitigate the limitations of AI tools when supporting families in autism care, especially across linguistic and cultural contexts.
Article activity feed
This Zenodo record is a permanently preserved version of a Structured PREreview. You can view the complete PREreview at https://prereview.org/reviews/17344933.
Does the introduction explain the objective of the research presented in the preprint? Yes
The introduction clearly explains that the objective is to provide a comprehensive assessment of the usefulness of AI-generated information for healthcare providers and families across different regions and languages, thereby offering practical recommendations for improving autism care.
Are the methods well-suited for this research? Highly appropriate
The methods are highly appropriate because they employ a cross-linguistic (Turkish/English) and cross-geographic design, directly addressing variability in AI performance across contexts. They use a standardized set of 44 caregiver-focused questions to ensure consistent comparative input across ChatGPT, Gemini, and Copilot. The assessment relies on validated, specific measurement tools (the 3Cs framework for accuracy, FKGL for readability, and PEMAT-P for actionability) to evaluate information quality systematically.
Are the conclusions supported by the data? Highly supported
Are the data presentations, including visualizations, well-suited to represent the data? Highly appropriate and clear
The comprehensive tables provide detailed descriptive statistics (means, SDs) for all combinations of the three LLMs, two languages, and three locations for accuracy, readability, actionability, and reference quality. Comparative figures (Figures 1–6) visually summarize the critical main effects and interaction effects (e.g., LLM × Location/Language) identified by the statistical analysis (ANOVA/MLM) for accuracy, readability, and actionability. This presentation effectively communicates the nuanced performance differences across the tools and contexts.
How clearly do the authors discuss, explain, and interpret their findings and potential next steps for the research? Very clearly
The authors break down their interpretation of the findings by tool performance (ChatGPT, Gemini, Copilot), highlighting nuance in each case. For example, they clarify the trade-off between ChatGPT's high accuracy and its lack of references, and explain that Gemini's superior actionability in Turkish was a "surprising result" requiring further investigation. They also discuss pervasive issues across all models, specifically the failure of all LLMs to meet readability standards (6th–8th grade level) and their reliance on medicalized terminology over neurodiversity-affirming language. The discussion of future steps is highly clear because the authors separate them into two distinct, actionable categories: Implications for Practice (operational next steps) and Recommendations for Future Research (academic next steps).
Is the preprint likely to advance academic knowledge? Highly likely
The preprint is likely to advance academic knowledge because it systematically addresses a significant gap regarding the quality, readability, and reliability of autism information provided by AI tools, a domain where research is currently sparse.
Would it benefit from language editing? Yes
For formal submission to a publisher or a broader audience, the preprint would benefit from language editing to ensure maximum academic conciseness and flow, although the core findings are already clearly and rigorously presented.
Would you recommend this preprint to others? Yes, it's of high quality
Is it ready for attention from an editor, publisher or broader audience? Yes, after minor changes
1. Language editing: the preprint requires professional refinement to improve grammatical consistency, clarify awkward phrasing, and ensure optimal academic conciseness and flow.
2. Peer review and validation: for attention from a publisher or a broader audience seeking validated results, the preprint needs to successfully complete the peer review process, as it is currently the "Not peer-reviewed version" of the research output.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.
This Zenodo record is a permanently preserved version of a Structured PREreview. You can view the complete PREreview at https://prereview.org/reviews/17285109.
Does the introduction explain the objective of the research presented in the preprint? Yes
The introduction explains why the research needed to be done and how the set of research questions originally arose. It gives an overview of the complete study, including the results of the tests performed and the conclusion. By describing current trends in using AI to answer autism-related questions, it also identifies the pitfalls of doing so and why the research was necessary.
Are the methods well-suited for this research? Highly appropriate
The methods are well suited to the research, covering all the parameters required to answer the research questions. The data analysis, using a combination of mixed-design ANOVAs and multilevel modeling to evaluate the effects of the LLMs, was highly resourceful. Including different input languages for the questions was important, and having the responses checked by different researchers was crucial to avoid bias. Overall, the research was carried out efficiently.
Are the conclusions supported by the data? Highly supported
The conclusion is thorough and supported by the data: the advantages of ChatGPT over Gemini and Copilot are mentioned, and the drawbacks of each AI model are also acknowledged in line with the results. This shows that the research questions were answered and how the data support a fair conclusion about the impact of the research on clinical practice. The conclusion does not overstate anything beyond what the data show.
Are the data presentations, including visualizations, well-suited to represent the data? Highly appropriate and clear
The data presentations are well suited, as tables are included to explain the variables used in the research and their interrelations.
How clearly do the authors discuss, explain, and interpret their findings and potential next steps for the research? Very clearly
The authors have clearly explained the results of the research, including the advantages of ChatGPT over Gemini and Copilot after thorough analysis, as well as the drawbacks of all three AI models, as mentioned earlier. After identifying the drawbacks, they describe how the findings can help doctors explain to patients how AI should be used to answer health-related questions, and where AI still has gaps in answering those questions.
Is the preprint likely to advance academic knowledge? Somewhat likely
The preprint only confirms the findings of the research done; many questions remain unanswered about AI and its gaps, and about whether the models can be improved.
Would it benefit from language editing? No
The language of the preprint is clear, with no impact on understanding its content.
Would you recommend this preprint to others? Yes, it's of high quality
The preprint is genuinely helpful as a general insight into AI models and how they perform in answering healthcare questions.
Is it ready for attention from an editor, publisher or broader audience? Yes, as it is
No changes appear necessary, so it can be reviewed by an editor.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.