MedRAGent: An Automatic Literature Retrieval and Screening System Utilizing Large Language Models with Retrieval-Augmented Generation

Zhuoyi Chen
Tianyi Liu
Yangrui Mo
Qishen Fu
Sibin Lei
Tiejun Tong
Xiaoyu Tang

Abstract

Background

Systematic reviews play a critical role in synthesizing evidence across numerous studies, providing a foundation for informed decision-making in medical practice. However, the process is resource-intensive, requiring proficiency in constructing Boolean queries and screening extensive literature, which are time-consuming and susceptible to inconsistencies, especially for non-expert researchers. While large language models (LLMs) offer a potential solution, their tendency to generate inaccurate or hallucinated content restricts their direct application in systematic reviews.

Objective

This study introduces and evaluates MedRAGent, a novel system that integrates LLMs with retrieval-augmented generation (RAG), designed to automate and enhance the efficiency and accuracy of Boolean query formulation and title/abstract screening in systematic reviews.

Methods

MedRAGent employs DeepSeek-V3-0324 and Kimi-K2-0711-preview LLMs within an RAG framework tailored for PubMed. The system utilizes the official Medical Subject Headings (MeSH) database to construct precise Boolean queries. For screening, it employs the LLMs with a structured prompt to automatically evaluate the relevance of retrieved articles based on predefined inclusion and exclusion criteria. Its performance was assessed using 53,054 articles from 6 research topics.
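
As a rough illustration of what a MeSH-based Boolean query for PubMed can look like, the sketch below joins synonyms within a concept using OR and joins the concepts using AND. It is not the authors' implementation: the function name `build_boolean_query`, the concept layout, and the example terms are all hypothetical. The only outside facts it relies on are PubMed's standard `[MeSH Terms]` and `[Title/Abstract]` field tags.

```python
from typing import Dict, List


def build_boolean_query(concepts: List[Dict[str, List[str]]]) -> str:
    """Combine concept groups into one PubMed-style Boolean query.

    Each concept is a dict with two (hypothetical) keys:
      "mesh"      -> MeSH headings, tagged [MeSH Terms]
      "free_text" -> free-text synonyms, tagged [Title/Abstract]
    Terms inside a concept are joined with OR; concepts are joined with AND.
    """
    groups = []
    for concept in concepts:
        terms = [f'"{m}"[MeSH Terms]' for m in concept.get("mesh", [])]
        terms += [f'"{t}"[Title/Abstract]' for t in concept.get("free_text", [])]
        if terms:  # skip concepts that contribute no terms
            groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)


# Illustrative input only; these terms do not come from the study.
example_concepts = [
    {"mesh": ["Hypertension"], "free_text": ["high blood pressure"]},
    {"mesh": ["Exercise"], "free_text": ["physical activity"]},
]

if __name__ == "__main__":
    print(build_boolean_query(example_concepts))
```

In a retrieval-augmented setup, the MeSH headings in each concept would be looked up in the official MeSH database rather than generated freely by the LLM, which is the step intended to limit hallucinated terms.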

Results

Our results showed that MedRAGent achieved an overall precision of 0.0271, recall of 0.8308, and F1-score of 0.0525 in Boolean query construction. For automated literature screening, the system attained an overall sensitivity of 0.8131, specificity of 0.9891, and G-mean of 0.8968 when using DeepSeek-V3-0324 as the underlying LLM. Performance improved when using Kimi-K2-0711-preview, with sensitivity of 0.8582, specificity of 0.9919, and G-mean of 0.9226. It efficiently processed 4,000-7,000 articles per day at low operational cost.
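
The summary metrics above follow from their standard definitions: F1 is the harmonic mean of precision and recall, and G-mean is the geometric mean of sensitivity and specificity. The short sketch below recomputes both from the reported component values; the function names are illustrative and the script is not part of MedRAGent.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def g_mean(sensitivity: float, specificity: float) -> float:
    """Geometric mean of sensitivity and specificity."""
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Component values as reported in the Results paragraph.
    print(f"Query F1:          {f1_score(0.0271, 0.8308):.4f}")
    print(f"G-mean (DeepSeek): {g_mean(0.8131, 0.9891):.4f}")
    print(f"G-mean (Kimi):     {g_mean(0.8582, 0.9919):.4f}")
```

Rounded to four decimals, these computations agree with the reported F1-score of 0.0525 and G-means of 0.8968 and 0.9226. Because F1 is a harmonic mean, the low query precision dominates it: a precision of 0.0271 corresponds to roughly 37 retrieved records per relevant one, which is the volume the screening stage then has to filter.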

Conclusions

MedRAGent demonstrates strong potential for automating Boolean query construction and abstract-level screening in systematic reviews. It effectively accelerates literature processing, supporting researchers in conducting efficient and evidence-based medical reviews.

Version published to 10.1101/2025.09.18.25335860 on medRxiv
Sep 19, 2025

Related articles

Automated Prediction of Radiological Protocols Using Retrieval Augmented Generation

This article has 10 authors:
1. Conrad T. Testagrose
2. Panagiotis Korfiatis
3. Timothy L. Kline
4. Justin D. Benfield
5. Cole J. Cook
6. Peggy S. Merkel
7. Mutlu Demirer
8. Richard D. White
9. Candice W. Bolan
10. Barbaros S. Erdal
This article has no evaluations.
Latest version: Sep 17, 2025

A machine learning model to support the screening for methods guidance articles in MEDLINE: A performance evaluation of ASReview simulation mode

This article has 8 authors:
1. Wael Abdelkader
2. Daniel Xie
3. Cynthia Lokker
4. Lingyang Chu
5. Stefan Schandelmaier
6. Ashirbani Saha
7. Muhammad Afzal
8. Alfonso Iorio
This article has no evaluations.
Latest version: Oct 15, 2025

Development of a RAG-based Expert LLM for Clinical Support in Radiation Oncology

This article has 6 authors:
1. Tingjun Liu
2. Xucheng Wang
3. Matthew Inkman
4. Julian C. Hong
5. Michael R. Waters
6. Jin Zhang
This article has no evaluations.
Latest version: Sep 18, 2025