The Perfect AI Research Assistant: Are ScholarGPT’s References More Reliable than ChatGPT’s?

Abstract

Introduction: Artificial intelligence has made a significant impact on surgery, particularly in surgical academia. ChatGPT's role in research has been widely studied but has drawn scepticism for generating inaccurate and non-existent references. ScholarGPT has been introduced to address these concerns, but its reliability has yet to be assessed. This study therefore aims to evaluate the accuracy of orthopaedic and plastic surgery references generated by ScholarGPT and to compare its performance with that of ChatGPT.

Methods: References were collected and assessed systematically. Each model was asked to generate 50 references: 40 for orthopaedics and 10 for plastic surgery. References were collected over 5 rounds, each generating 10 references. Each reference was manually verified for its existence, the accuracy of its PubMed ID and the validity of its DOI. The overall accuracy rate for each model was then calculated.

Results: ScholarGPT demonstrated a 100% accuracy rate, generating 50 verifiable references with valid PubMed IDs and DOIs. ChatGPT demonstrated a 42% accuracy rate. Neither model generated non-existent references. ScholarGPT also provided a more diverse range of references, whereas ChatGPT's references were non-specific.

Conclusion: ScholarGPT outperformed ChatGPT and proved capable of providing reliable evidence for research outputs. Further research should explore its applications across other clinical specialties to validate its effectiveness, and should also examine integrating ScholarGPT into the writing of research articles.
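The Methods describe tallying three manual checks per reference (existence, PubMed ID accuracy, DOI validity) into an overall accuracy rate. A minimal sketch of that tally, assuming a reference counts as accurate only when all three checks pass (the function and field names here are illustrative assumptions, not the authors' procedure):

```python
# Illustrative sketch, not the authors' code: tallying per-reference
# verification results into an overall accuracy rate, as in Methods.

def accuracy_rate(checks):
    """checks: list of (exists, pmid_ok, doi_ok) booleans, one per reference.
    A reference is counted as accurate only if all three checks pass."""
    verified = sum(1 for exists, pmid_ok, doi_ok in checks
                   if exists and pmid_ok and doi_ok)
    return 100.0 * verified / len(checks)

# Hypothetical tallies mirroring the reported results: 50 references
# per model, ScholarGPT all valid (100%), ChatGPT 21 of 50 valid (42%).
scholargpt_checks = [(True, True, True)] * 50
chatgpt_checks = [(True, True, True)] * 21 + [(True, False, False)] * 29

print(accuracy_rate(scholargpt_checks))  # 100.0
print(accuracy_rate(chatgpt_checks))     # 42.0
```

This mirrors the reported figures: 21 of ChatGPT's 50 references passing all checks yields the stated 42% rate.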
