Leveraging Large Language Models for Data Extraction in Metaresearch
Abstract
Manual data extraction in metaresearch is often a tedious, time-consuming, and error-prone process. In this paper, we investigate whether the current generation of Large Language Models (LLMs) can be used to extract accurate information from scientific papers. Across the metaresearch literature, extraction tasks usually range from retrieving verbatim information (e.g., the number of participants in a study, effect sizes, or whether the study is preregistered) to making subjective inferences. Using a publicly available dataset (Blanchard et al., 2022) containing a wide range of meta-scientific variables from 34 network psychometrics papers, we tested six LLMs (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku, GPT-4o, GPT-4o mini, o1-preview). We used the models' APIs to extract the variables from the documents automatically. This automated pipeline allows batch processing of research papers and thus represents a more efficient and scalable way to extract metascientific data than the default chat interface. Our results point to high accuracy and high potential of LLMs for metascientific data extraction. The accuracy of the individual models ranged from 75% to 90%, and most models were able to convey uncertainty in the more contentious cases. We compare the accuracy and cost-effectiveness of the individual models and discuss the characteristics of variables that are (un)suitable for automatic coding. Furthermore, we describe common pitfalls and best practices of automated LLM data extraction. The proposed procedure can decrease the time and costs associated with conducting metaresearch by orders of magnitude.
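To illustrate the kind of API-based extraction the abstract describes, the following is a minimal sketch, not the authors' actual pipeline: it assumes the Anthropic Python SDK, an API key in the environment, and an illustrative prompt and variable list (`sample_size`, `preregistered`) that are placeholders rather than the coding scheme used in the paper. Looping such a call over a folder of papers is what enables batch processing.

```python
# Minimal sketch of API-based variable extraction from one paper's text.
# Assumes the Anthropic Python SDK (`pip install anthropic`) and an
# ANTHROPIC_API_KEY environment variable; the prompt and variables below
# are illustrative, not the paper's actual coding scheme.
import anthropic

client = anthropic.Anthropic()

# Full text of a single research paper (hypothetical file name).
paper_text = open("paper.txt", encoding="utf-8").read()

prompt = (
    "From the paper below, extract the following variables as JSON: "
    "sample_size (integer), preregistered (yes/no/unclear). "
    "If a value cannot be determined from the text, answer 'unclear'.\n\n"
    + paper_text
)

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=500,
    messages=[{"role": "user", "content": prompt}],
)

# The model's JSON-formatted extraction for this paper.
print(response.content[0].text)
```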