A machine learning model to support the screening for methods guidance articles in MEDLINE: A performance evaluation of ASReview simulation mode


Abstract

Background

Advances in clinical research methods are frequently published in biomedical journals, but identifying these articles remains challenging because of the rapid growth of the literature and insufficient indexing in biomedical databases. These challenges hinder the curation of methodologically focused resources such as the Library of Guidance for Health Scientists (LIGHTS). Traditional screening approaches, such as Boolean search strategies and manual abstract screening, are inefficient and resource-intensive, limiting the feasibility of regularly updating LIGHTS. Machine learning (ML), particularly active learning models, presents a promising solution to improve the efficiency of article screening.

Objectives

This study evaluates the performance of ASReview’s active learning feature in identifying relevant methods guidance articles using pre-labeled data in simulation mode.

Methods

Using a pre-labeled dataset composed of 1,500 methods guidance articles and 20,000 clinical studies, categorized as relevant or irrelevant, we trained and compared multiple simulation models in ASReview using various classifiers and feature extraction methods. These included combinations of Support Vector Machine (SVM), Naïve Bayes (NB), and Neural Network classifiers with Sentence BERT (sBERT), Doc2Vec, and TF-IDF feature extraction. Model performance was evaluated based on screening burden, recall, Work Saved over Sampling (WSS), and precision. All model combinations used maximum query and dynamic double sampling settings.
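The simulation workflow described here follows the standard active learning loop: train on the labeled records, score the unlabeled pool, query the record the model deems most likely relevant ("maximum query" / certainty-based sampling), reveal its label, and repeat. The following is a toy, pure-Python sketch of that loop, not ASReview's implementation: it substitutes a simple TF-IDF nearest-centroid scorer for the SVM classifier, and the example documents, seed indices, and helper names are illustrative assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Represent each document as a sparse TF-IDF dict (toy stand-in for a real vectorizer)."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))
    n = len(docs)
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vecs):
    c = Counter()
    for v in vecs:
        for t, w in v.items():
            c[t] += w / len(vecs)
    return c

def simulate(docs, labels, seeds):
    """Certainty-based active learning simulation on pre-labeled data.

    Returns (screening_burden, screened_indices). Labels are revealed one
    at a time, mimicking ASReview's simulation mode; screening stops once
    every relevant record has been found.
    """
    vecs = tfidf_vectors(docs)
    screened = list(seeds)
    pool = [i for i in range(len(docs)) if i not in screened]
    total_relevant = sum(labels)
    found = sum(labels[i] for i in screened)
    while pool and found < total_relevant:
        c_rel = centroid([vecs[i] for i in screened if labels[i] == 1])
        c_irr = centroid([vecs[i] for i in screened if labels[i] == 0])
        # Maximum query: screen the record scored most likely relevant next.
        pool.sort(key=lambda i: cosine(vecs[i], c_rel) - cosine(vecs[i], c_irr),
                  reverse=True)
        nxt = pool.pop(0)
        screened.append(nxt)
        found += labels[nxt]
    return len(screened) / len(docs), screened

# Illustrative corpus: 4 "methods guidance" records (label 1), 6 clinical studies (label 0).
docs = [
    "guidance on sample size calculation for clinical prediction models",
    "reporting guidance for systematic reviews and meta analysis methods",
    "guidance on handling missing data in clinical research methods",
    "methods guidance for developing core outcome sets reporting",
    "randomized trial of aspirin in patients with cardiovascular disease",
    "cohort study of statin therapy outcomes in elderly patients",
    "phase two trial of chemotherapy in lung cancer patients",
    "observational study of diabetes treatment in primary care patients",
    "randomized placebo controlled trial of vaccine efficacy in adults",
    "case control study of smoking and cancer risk in patients",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
burden, screened = simulate(docs, labels, seeds=[0, 4])
```

Because the model queries likely-relevant records first, all relevant records are found well before the whole corpus is screened, which is exactly what the screening-burden and WSS metrics below quantify at scale.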

Results

At 95-99.5% recall, SVM with TF-IDF required the fewest screened records (6.87-7.66% burden), while SVM with Doc2Vec achieved the best overall performance at 100% recall with only 11.47% screening burden (WSS@100 = 88.5%) in 42 minutes. Models using sBERT for feature extraction performed comparably through 99.5% recall but exhibited severe performance degradation at 100% recall, requiring screening of over 65% of the corpus.
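The WSS figure follows directly from the reported screening burden. Under the usual definition, WSS@R = (TN + FN)/N − (1 − R); at 100% recall every unscreened record is a true negative, so WSS@100 reduces to the fraction of records never screened. A minimal check, assuming the corpus size of 21,500 records (1,500 guidance articles plus 20,000 clinical studies) from the Methods:

```python
def wss(n_total, n_screened, recall):
    # WSS@R = (TN + FN)/N - (1 - R). When screening stops at recall R,
    # the unscreened records are exactly the TN + FN records.
    return (n_total - n_screened) / n_total - (1 - recall)

n = 21500                      # 1,500 guidance articles + 20,000 clinical studies
screened = round(0.1147 * n)   # 11.47% screening burden at 100% recall
print(round(wss(n, screened, 1.0) * 100, 1))  # prints 88.5
```

This reproduces the abstract's WSS@100 = 88.5% for the SVM + Doc2Vec model.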

Conclusion

Classical feature extraction methods (TF-IDF and Doc2Vec) paired with SVM outperformed deep learning embedding methods. In this controlled setting, ASReview is a feasible tool for screening methodological literature. Future work should include prospective, human-in-the-loop experiments that evaluate the Doc2Vec-based SVM pipeline against human screening.
