Artificial Intelligence’s Contribution to Biomedical Literature Search: Revolutionizing or Complicating?
Abstract
There is a growing number of articles about conversational AI (i.e., ChatGPT) for generating scientific literature reviews and summaries. Yet comparative evidence for its utility lags behind its wide adoption by many clinicians and researchers. We explored ChatGPT's utility for literature search from an end-user perspective, through the lens of clinicians and biomedical researchers. We quantitatively compared the utility of basic versions of ChatGPT against conventional search methods such as Google and PubMed. We further tested whether ChatGPT user-support tools (i.e., plugins, the web-browsing function, prompt engineering, and custom GPTs) could improve its responses across four common and practical literature search scenarios: (1) high-interest topics with an abundance of information, (2) niche topics with limited information, (3) scientific hypothesis generation, and (4) questions about newly emerging clinical practices. Our results demonstrated that basic ChatGPT functions had limitations in consistency, accuracy, and relevance. User-support tools yielded improvements, but the limitations persisted. Interestingly, each literature search scenario posed different challenges: an abundance of secondary information sources for high-interest topics, and unconvincing literature for new or niche topics. This study tested practical examples highlighting both the potential and the pitfalls of integrating conversational AI into literature search processes, and it underscores the necessity of rigorous comparative assessments of AI tools in scientific research.
Author Summary
As generative artificial intelligence (AI) tools become increasingly functional, the promise of this technology is creating a wave of excitement and anticipation around the globe, including in the wider scientific and biomedical community. Despite this growing excitement, researchers seeking robust, reliable, reproducible, and peer-reviewed findings have raised concerns about AI's current limitations, particularly its potential to spread and promote misinformation. This emphasizes the need for continued discussion of how to appropriately employ AI to streamline current research practices. As members of the scientific community and end-users of conversational AI tools, we sought to explore practical ways of incorporating AI to streamline research practices. Here, we probed whether a text-based research task, scientific literature mining, can be outsourced to ChatGPT and to what extent human adjudication might be necessary. We tested different models of ChatGPT as well as augmentations such as plugins and custom GPTs under different contexts of biomedical literature searching. Our results show that, at present, ChatGPT does not meet the level of reliability needed for wide adoption in scientific literature searching. However, as conversational AI tools rapidly advance (a trend highlighted by the development of the augmentations discussed in this article), we envision a time when ChatGPT can become a great time saver for literature searches and make scientific information easily accessible.