Utilizing ChatGPT to select literature for meta-analysis shows workload reduction while maintaining a similar recall level as manual curation

Read the full article See related articles

Listed in

  • AI (mark2d2)
Log in to save this article

Abstract

Background

Large language models (LLMs) like ChatGPT showed great potential in aiding medical research. A heavy workload in filtering records is needed during the research process of evidence-based medicine, especially meta-analysis. However, no study tried to use LLMs to help screen records in meta-analysis. In this research, we aimed to explore the possibility of incorporating ChatGPT to facilitate the screening step based on the title and abstract of records during meta-analysis.

Methods

To assess our strategy, we selected three meta-analyses from the literature, together with a glioma meta-analysis embedded in the study, as additional validation. For the automatic selection of records from curated meta-analyses, a four-step strategy called LARS was developed, consisting of (1) criteria selection and single-prompt (prompt with one criterion) creation, (2) best combination identification, (3) combined-prompt (prompt with one or more criteria) creation, and (4) request sending and answer summary. We evaluated the robustness of the response from ChatGPT with repeated requests. Recall, workload reduction, precision, and F1 score were calculated to assess the performance of LARS.

Findings

ChatGPT showed a stable response for repeated requests (robustness score: 0·747 – 0·996). A variable performance was found between different single-prompts with a mean recall of 0·841. Based on these single-prompts, we were able to find combinations with performance better than the pre-set threshold. Finally, with a best combination of criteria identified, LARS showed a 39·5% workload reduction on average with a recall greater than 0·9. In the glioma meta-analysis, we found no prognostic effect of CD8+ TIL on overall survival, progress-free survival, and survival time after immunotherapy.

Interpretation

We show here the groundbreaking finding that automatic selection of literature for meta-analysis is possible with ChatGPT. We provide it here as a pipeline, LARS, which showed a great workload reduction while maintaining a pre-set recall.

Funding

China Scholarship Council.

Article activity feed