Accelerating Systematic Reviews with Large Language Models: A Rapid Evidence-Based Synthesis
Abstract
This study explores the application of Large Language Models (LLMs) in systematic reviews, focusing on their performance, their consistency with human reviewers, and their potential for cost savings. Analyzing 44 studies, the review finds that LLMs achieve moderate to high performance in title and abstract screening, full-text screening, and data extraction, but only low to moderate performance in literature search and quality assessment. Prompt design significantly affects LLM performance, with Chain of Thought (CoT) prompts often improving outcomes. Agreement with human reviewers varies: it is moderate to high in certain stages but lower in quality assessment. The findings suggest that while LLMs cannot fully replace human reviewers, they serve as valuable assistants in systematic reviews, particularly in reducing time and effort. The study also offers practical recommendations for integrating LLMs effectively and discusses the challenges and future research directions in this evolving field.
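The abstract does not specify the exact prompts the reviewed studies used, but a minimal sketch can illustrate what a CoT prompt for title and abstract screening might look like in practice. In the sketch below, the inclusion criteria, the `COT_SCREENING_PROMPT` template, and the `call_llm` function are all hypothetical placeholders, assuming any LLM interface that maps a prompt string to a text response.

```python
# Minimal sketch of a Chain of Thought (CoT) prompt for title/abstract
# screening. The criteria, template, and `call_llm` parameter are
# illustrative assumptions, not the study's actual setup.

COT_SCREENING_PROMPT = """\
You are screening records for a systematic review.

Inclusion criteria:
1. Randomized controlled trial in adult patients.
2. Evaluates a digital health intervention.

Title: {title}
Abstract: {abstract}

Think step by step: for each criterion, state whether the record
meets it and why. Then answer on a final line with exactly
"DECISION: INCLUDE" or "DECISION: EXCLUDE".
"""


def screen_record(title: str, abstract: str, call_llm) -> bool:
    """Return True if the LLM's final decision is INCLUDE.

    `call_llm` is any function mapping a prompt string to the model's
    text response (e.g., a thin wrapper around a chat-completion API).
    """
    prompt = COT_SCREENING_PROMPT.format(title=title, abstract=abstract)
    response = call_llm(prompt)
    # Scan from the end so the step-by-step reasoning above the final
    # decision line does not interfere with parsing.
    for line in reversed(response.splitlines()):
        if line.strip().startswith("DECISION:"):
            return "INCLUDE" in line
    return False  # Conservative default: exclude if no decision is parsed.
```

One design point worth noting: keeping the free-form reasoning separate from a strictly formatted final decision line makes CoT output both machine-parseable and auditable, since human reviewers can inspect the stated rationale for any inclusion or exclusion.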