Accelerating Disease Model Parameter Extraction: An LLM-based Ranking Approach to Select Initial Studies For Literature Review Automation
Abstract
As climate change transforms our environment and human intrusion into natural ecosystems escalates, there is a growing demand for disease spread models to forecast and plan for the next zoonotic disease outbreak. Accurate parametrization of these models requires data from diverse sources, including the scientific literature. Despite the abundance of scientific publications, manually extracting these data via systematic literature reviews remains a significant bottleneck: it demands extensive time and resources and is susceptible to human error. To address this challenge, a novel automated parameter extraction framework, CliZod, is presented. A crucial stage of the automation process is screening scientific articles for relevance. This paper focuses on leveraging large language models (LLMs) to assist in the initial selection and ranking of primary studies, which then serve as training examples for the screening stage. By framing the article selection criteria as a question-answering task and using zero-shot chain-of-thought prompting, the proposed method saves at least 60% of the work effort of manual screening at a recall level of 95% (NWSS@95%). This was validated on four datasets covering four distinct zoonotic diseases and a critical climate variable (rainfall). The approach additionally produces explainable AI rationales for each ranked article. Its effectiveness across multiple diseases demonstrates the potential for broad application in systematic literature reviews. The substantial reduction in screening effort, together with the explainable AI rationales, marks an important step toward automated parameter extraction from scientific literature.
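To make the question-answer framing concrete, the sketch below shows one way a selection criterion could be posed to an LLM with zero-shot chain-of-thought prompting, with the model's reasoning retained as the rationale and the final line parsed as the relevance verdict. The prompt wording, the function names, and the verdict parsing are illustrative assumptions for exposition, not the paper's exact implementation, and the model response is mocked rather than produced by an API call.

```python
# Illustrative sketch: frame an inclusion criterion as a question-answer
# task with zero-shot chain-of-thought prompting. All names and prompt
# text here are hypothetical, not CliZod's actual implementation.

def build_screening_prompt(criterion: str, title: str, abstract: str) -> str:
    """Compose a zero-shot CoT prompt asking whether an article meets a criterion."""
    return (
        f"Question: {criterion}\n"
        f"Title: {title}\n"
        f"Abstract: {abstract}\n"
        "Let's think step by step, then answer 'Yes' or 'No' on the last line."
    )

def parse_verdict(response: str) -> bool:
    """Read the final-line Yes/No verdict; the reasoning above it is kept as the rationale."""
    last_line = response.strip().splitlines()[-1].lower()
    return "yes" in last_line

# Mocked model response (no LLM call is made in this sketch):
prompt = build_screening_prompt(
    "Does this study report transmission parameters for a zoonotic disease?",
    "Rainfall and rodent-borne disease dynamics",
    "We estimate transmission rates under varying rainfall conditions...",
)
mock_response = (
    "The abstract reports estimated transmission rates for a rodent-borne "
    "(zoonotic) disease, so it meets the criterion.\n"
    "Yes"
)
print(parse_verdict(mock_response))  # → True
```

In a full pipeline, the per-article reasoning text would double as the explainable rationale, and the verdicts (or graded scores) over many criteria could be aggregated to rank articles for screening.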