Performance of Large Language Artificial Intelligence Models on Clear Aligner Treatment: Evaluation of Accuracy and Readability in Answering Questions for Patients



Abstract

Objective: This study aims to scientifically evaluate the responses provided by ChatGPT 4o to patient-posed questions regarding clear aligner treatment and to investigate the impact of artificial intelligence on medical processes.

Materials and Methods: A survey of 200 patients collected questions about clear aligner treatment, identifying 29 frequently asked ones. The researchers added 14 additional questions, yielding 43 in total, categorized into usage, side effects/complications, treatment limitations, and others. These questions were posed to the chatbot in a single session. The chatbot's responses were analyzed for narrative level, readability, informational accuracy, and plagiarism using established scoring systems. The data underwent statistical analysis with SPSS V.22.

Results: The highest FKGL score was observed in usage-related questions (14.53 ± 8.88), while the highest FRES score was in the "other" group (54.30 ± 0.89). DISCERN, EQIP, and GQS scores were highest in the usage and side effect/complication groups. Correlation analysis showed the strongest correlation between similarity and FKGL (r = -0.391, p = 0.010). For the top 10 questions, the average FKGL was 13.37 ± 4.37 and the average FRES was 47.18 ± 23.69. Regression analysis identified no significant relationships.

Conclusion: ChatGPT 4o demonstrates potential as a supplementary tool in patient education under professional supervision. However, further research is required to optimize its application and ensure its reliability in clinical settings.
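For readers unfamiliar with the readability metrics reported above, the FRES (Flesch Reading Ease Score) and FKGL (Flesch-Kincaid Grade Level) are standard formulas based on average sentence length and average syllables per word. The sketch below illustrates both; it is not the tool used in this study, and the syllable counter is a naive vowel-group heuristic rather than a dictionary-based count:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per run of consecutive vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)          # words per sentence
    spw = syllables / len(words)               # syllables per word
    fres = 206.835 - 1.015 * wps - 84.6 * spw  # Flesch Reading Ease
    fkgl = 0.39 * wps + 11.8 * spw - 15.59     # Flesch-Kincaid Grade Level
    return fres, fkgl
```

On this scale, lower FRES and higher FKGL indicate harder text; an FKGL of 13.37, as reported for the top 10 questions, corresponds roughly to college-level reading, well above the sixth-grade level commonly recommended for patient materials.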
