Assessing AI-Generated Autism Information for Healthcare Use: A Cross-Linguistic and Cross-Geographic Evaluation of ChatGPT, Gemini, and Copilot
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (PREreview)
Abstract
Background/Objectives: Autism is one of the most prevalent neurodevelopmental conditions globally, and healthcare professionals, including pediatricians, developmental specialists, and speech–language pathologists, play a central role in guiding families through diagnosis, treatment, and support. As caregivers increasingly turn to digital platforms for autism-related information, artificial intelligence (AI) tools such as ChatGPT, Gemini, and Microsoft Copilot are emerging as popular sources of guidance. However, little is known about the quality, readability, and reliability of the information these tools provide. This study conducted a detailed comparative analysis of three widely used AI models within defined linguistic and geographic contexts to examine the quality of the autism-related information they generate. Methods: Responses to 44 caregiver-focused questions spanning two key domains—foundational knowledge and practical supports—were evaluated across three countries (USA, England, and Türkiye) and two languages (English and Turkish). Responses were coded for accuracy, readability, actionability, language framing, and reference quality. Results: ChatGPT generated the most accurate content but lacked reference transparency; Gemini produced the most actionable and well-referenced responses, particularly in Turkish; and Copilot used more accessible language but demonstrated lower overall accuracy. Across tools, responses often used medicalized language and exceeded recommended readability levels for health communication. Conclusions: These findings have critical implications for healthcare providers, who are increasingly tasked with helping families evaluate and navigate AI-generated information. This study offers practical recommendations for how providers can leverage the strengths and mitigate the limitations of AI tools when supporting families in autism care, especially across linguistic and cultural contexts.
Article activity feed
This Zenodo record is a permanently preserved version of a Structured PREreview. You can view the complete PREreview at https://prereview.org/reviews/17344933.
Does the introduction explain the objective of the research presented in the preprint? Yes
The introduction clearly explains that the objective is to provide a comprehensive assessment of the usefulness of AI-generated information for healthcare providers and families across different regions and languages, thereby offering practical recommendations for improving autism care.
Are the methods well-suited for this research? Highly appropriate
The methods are highly appropriate because they employ a cross-linguistic (Turkish/English) and cross-geographic design, directly addressing variability in AI performance across contexts. They use a standardized set of 44 caregiver-focused questions to ensure consistent comparative input across ChatGPT, Gemini, and Copilot. The assessment relies on validated, specific measurement tools (the 3Cs framework for accuracy, FKGL for readability, and PEMAT-P for actionability) to evaluate information quality systematically.
Are the conclusions supported by the data? Highly supported
Are the data presentations, including visualizations, well-suited to represent the data? Highly appropriate and clear
The comprehensive tables provide detailed descriptive statistics (means, SDs) for all combinations of the three LLMs, two languages, and three locations for accuracy, readability, actionability, and reference quality. Comparative figures (Figures 1–6) visually summarize the critical main effects and interaction effects (e.g., LLM × Location/Language) identified by the statistical analysis (ANOVA/MLM) for accuracy, readability, and actionability. This presentation effectively communicates the nuanced performance differences across the tools and contexts.
How clearly do the authors discuss, explain, and interpret their findings and potential next steps for the research? Very clearly
The authors break down their interpretation of the findings by tool performance (ChatGPT, Gemini, Copilot), highlighting nuance in each case. For example, they clarify the trade-off between ChatGPT's high accuracy and its lack of references, and explain that Gemini's superior actionability in Turkish was a "surprising result" requiring further investigation. They also discuss pervasive issues across all models, specifically the failure of all LLMs to meet readability standards (6th–8th grade level) and their reliance on medicalized terminology over neurodiversity-affirming language. The discussion of future steps is highly clear because the authors separate them into two distinct, actionable categories: Implications for Practice (operational next steps) and Recommendations for Future Research (academic next steps).
Is the preprint likely to advance academic knowledge? Highly likely
The preprint is likely to advance academic knowledge because it systematically addresses a significant gap regarding the quality, readability, and reliability of autism information provided by AI tools, a domain where research is currently sparse.
Would it benefit from language editing? Yes
For formal submission to a publisher or a broader audience, the preprint would benefit from language editing to ensure maximum academic conciseness and flow, although the core findings are already clearly and rigorously presented.
Would you recommend this preprint to others? Yes, it's of high quality
Is it ready for attention from an editor, publisher or broader audience? Yes, after minor changes
1. Language editing: the preprint requires professional refinement to improve grammatical consistency, clarify awkward phrasing, and ensure optimal academic conciseness and flow.
2. Peer review and validation: for attention from a publisher or a broader audience seeking validated results, the preprint needs to successfully complete the peer review process, as it is currently the "Not peer-reviewed version" of the research output.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.
This Zenodo record is a permanently preserved version of a Structured PREreview. You can view the complete PREreview at https://prereview.org/reviews/17285109.
Does the introduction explain the objective of the research presented in the preprint? Yes
The introduction explains why the research needed to be done and how the set of research questions originally arose. It gives an overview of the complete study, including the results of the tests performed and the conclusion. By describing current trends in using AI to answer autism-related questions, it also identifies the pitfalls of doing so and why the research was necessary.
Are the methods well-suited for this research? Highly appropriate
The methods are well suited to the research, covering all the parameters required to answer the research questions. The data analysis, using a combination of mixed-design ANOVAs and multilevel modeling to evaluate the effects of the LLMs, was highly resourceful. Including different input languages for the questions was important, and having the responses checked by different researchers was crucial to avoid bias. Overall, the research was carried out efficiently.
Are the conclusions supported by the data? Highly supported
The conclusion is thorough and supported by the data: the advantages of ChatGPT over Gemini and Copilot are mentioned, and the drawbacks of each AI model are also acknowledged in line with the results. This shows that the research questions were answered and how the data support a fair conclusion about the impact of the research on clinical practice. The conclusion does not overstate anything beyond what the data show.
Are the data presentations, including visualizations, well-suited to represent the data? Highly appropriate and clear
The data presentations are well suited, as tables are included to explain the variables used in the research and their interrelations.
How clearly do the authors discuss, explain, and interpret their findings and potential next steps for the research? Very clearly
The authors have clearly explained the results of the research, including the advantages of ChatGPT over Gemini and Copilot after thorough analysis, as well as the drawbacks of all three AI models, as mentioned earlier. After identifying the drawbacks, they describe how the findings can help doctors explain to patients how AI should be used to answer health-related questions, and where AI still has gaps in answering those questions.
Is the preprint likely to advance academic knowledge? Somewhat likely
The preprint only confirms the findings of the research done; many questions remain unanswered about AI and its gaps, and about whether the models can be improved.
Would it benefit from language editing? No
The language of the preprint is clear, with no impact on understanding its content.
Would you recommend this preprint to others? Yes, it's of high quality
The preprint is genuinely helpful as a general insight into AI models and how they perform in answering healthcare questions.
Is it ready for attention from an editor, publisher or broader audience? Yes, as it is
No changes appear necessary, so it can be reviewed by an editor.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they did not use generative AI to come up with new ideas for their review.