Automating abstract screening in research synthesis using large language models: A tutorial and proof-of-concept study
Abstract
Screening abstracts is a crucial yet labor-intensive step in research synthesis projects such as systematic reviews and meta-analyses. Large language models (LLMs) offer an opportunity to streamline and automate this process. However, there is still little practical guidance on how such automated workflows can be implemented in research practice. In this article, we propose, describe, and illustrate an LLM-based abstract screening procedure that includes a quantification of the uncertainty of model outputs. We provide a step-by-step tutorial addressing how to select an LLM, access LLMs from within R, develop and refine suitable prompts, and define structured output formats, as well as whether and how model hyperparameters should be set. We then illustrate this procedure in a proof-of-concept study, using a set of human-rated abstracts from a recent meta-analysis for comparison. We demonstrate that LLM-assisted screening can substantially reduce the time and cost of review preparation while maintaining accuracy comparable to that of human raters. To support psychological researchers in adopting LLM-based workflows for research synthesis, we also provide R scripts and implementation guidance on GitLab. At the same time, we emphasize that this work represents an initial step: continued refinement and validation are essential as LLM technologies and their applications evolve rapidly.
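For readers unfamiliar with calling LLM APIs from R, the sketch below illustrates what a single screening call of the kind described in the abstract could look like. It is a minimal illustration, not the authors' GitLab implementation: it assumes the OpenAI chat completions endpoint and the httr2 package, an API key in the OPENAI_API_KEY environment variable, and placeholder choices for the model name, prompt wording, and JSON output fields.

```r
# Minimal sketch: classify one abstract for inclusion via an LLM API.
# Assumes the OpenAI chat completions endpoint and the httr2 package;
# the model name, prompt, and output fields below are illustrative only.
library(httr2)

screen_abstract <- function(abstract_text) {
  prompt <- paste(
    "You are screening abstracts for a meta-analysis on X.",  # placeholder criteria
    "Reply in JSON with fields 'include' (true/false) and",
    "'confidence' (0-1).",
    "Abstract:", abstract_text
  )

  resp <- request("https://api.openai.com/v1/chat/completions") |>
    req_headers(
      Authorization = paste("Bearer", Sys.getenv("OPENAI_API_KEY"))
    ) |>
    req_body_json(list(
      model = "gpt-4o-mini",                        # placeholder model
      temperature = 0,                              # a hyperparameter choice for reproducibility
      response_format = list(type = "json_object"), # request structured JSON output
      messages = list(
        list(role = "user", content = prompt)
      )
    )) |>
    req_perform()

  # Parse the model's JSON reply into an R list.
  jsonlite::fromJSON(
    resp_body_json(resp)$choices[[1]]$message$content
  )
}

# Example usage with a hypothetical abstract:
# screen_abstract("We meta-analyzed 42 randomized trials on ...")
```

Repeating such calls over a set of abstracts and aggregating the returned labels is one conceivable route to quantifying output uncertainty, though the procedure developed in the paper may differ.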