Artificial Intelligence’s Contribution to Biomedical Literature Search: Revolutionizing or Complicating?
Abstract
There is a growing number of articles about conversational AI (i.e., ChatGPT) for generating scientific literature reviews and summaries. Yet comparative evidence for its utility lags behind its wide adoption by many clinicians and researchers. We explored ChatGPT's utility for literature search from an end-user perspective, through the lens of clinicians and biomedical researchers. We quantitatively compared the utility of basic versions of ChatGPT against conventional search methods such as Google and PubMed. We further tested whether ChatGPT user-support tools (i.e., plugins, the web-browsing function, prompt engineering, and custom GPTs) could improve its responses across four common and practical literature search scenarios: (1) high-interest topics with an abundance of information, (2) niche topics with limited information, (3) scientific hypothesis generation, and (4) questions about newly emerging clinical practices. Our results demonstrated that basic ChatGPT functions had limitations in consistency, accuracy, and relevance. User-support tools yielded improvements, but the limitations persisted. Interestingly, each literature search scenario posed different challenges: an abundance of secondary information sources for high-interest topics, and unconvincing literature for new or niche topics. This study tested practical examples highlighting both the potential and the pitfalls of integrating conversational AI into literature search processes, and it underscores the necessity of rigorous comparative assessments of AI tools in scientific research.
Author Summary
As generative artificial intelligence (AI) tools become increasingly functional, the promise of this technology is creating a wave of excitement and anticipation around the globe, including in the wider scientific and biomedical community. Despite this growing excitement, researchers seeking robust, reliable, reproducible, and peer-reviewed findings have raised concerns about AI's current limitations, particularly its potential to spread and promote misinformation. This emphasizes the need for continued discussion of how to appropriately employ AI to streamline current research practices. As members of the scientific community and end-users of conversational AI tools, we sought to explore practical ways of incorporating AI to streamline research practices. Here, we probed whether a text-based research task, scientific literature mining, can be outsourced to ChatGPT and to what extent human adjudication might be necessary. We tested different models of ChatGPT as well as augmentations such as plugins and custom GPTs under different contexts of biomedical literature searching. Our results show that, at present, ChatGPT does not meet the level of reliability needed for wide adoption in scientific literature searching. However, as conversational AI tools rapidly advance (a trend highlighted by the development of the augmentations discussed in this article), we envision a time when ChatGPT can become a great time saver for literature searches and make scientific information easily accessible.