Resampling Methods for Class Imbalance in Clinical Prediction Models: A Systematic Review and Meta-Regression Protocol
Abstract
Introduction
Class imbalance, in which clinically important “positive” cases make up less than 30% of a dataset, systematically degrades the sensitivity and fairness of medical prediction models. Although data-level techniques such as random oversampling, random undersampling and SMOTE, together with algorithm-level approaches such as cost-sensitive learning, are widely used, the empirical evidence on when these corrections actually improve model performance remains fragmented across diseases and modelling frameworks. This protocol outlines a scoping systematic review with meta-regression that will map and quantitatively summarise 15 years of research on resampling strategies in imbalanced clinical datasets, addressing a critical methodological gap in trustworthy medical AI.
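To make the data-level techniques concrete, the following is a minimal sketch of the core SMOTE idea: new minority-class samples are generated by linear interpolation between an existing minority point and one of its k nearest minority neighbours. This is an illustrative NumPy implementation, not the reference algorithm; the function name `smote_sketch` and the toy data are our own.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=5, rng=None):
    """Illustrative SMOTE-style oversampling: interpolate between each
    sampled minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise Euclidean distances within the minority class only
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-matches
    nn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours per point
    base = rng.integers(0, n, size=n_new)       # seed points for synthesis
    nbr = nn[base, rng.integers(0, k, size=n_new)]
    lam = rng.random((n_new, 1))                # interpolation weights in [0, 1)
    return X_min[base] + lam * (X_min[nbr] - X_min[base])

# toy minority class: 5 points in 2-D; generate 10 synthetic samples
X_min = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.], [.5, .5]])
X_new = smote_sketch(X_min, n_new=10, k=3, rng=0)
```

Because each synthetic point is a convex combination of two minority samples, it always lies within the bounding region of the minority class, which is both SMOTE's appeal (smoother decision regions than naive duplication) and a known limitation near class-overlap boundaries.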
Methods and analysis
We will search MEDLINE, EMBASE, Scopus, Web of Science Core Collection and IEEE Xplore, plus grey-literature sources (medRxiv, arXiv, bioRxiv), for primary studies (2009 to 31 December 2024) that apply at least one resampling or cost-sensitive method to binary clinical prediction tasks with a minority-class prevalence <30%. No language restrictions will be applied. Two reviewers will screen records, extract data with a piloted form and document the process in a PRISMA flow diagram. A descriptive synthesis will catalogue clinical domain, sample size, imbalance ratio, resampling technique, model type and performance metrics. Where ≥10 studies report compatible AUCs, a random-effects meta-regression (on logit-transformed AUC) will examine moderators including imbalance ratio, resampling class, model family and sample size. Small-study effects will be probed with funnel plots, Egger’s test, trim-and-fill and weight-function models; influence diagnostics and leave-one-out analyses will assess robustness. Because this is a methodological review, formal clinical risk-of-bias tools are optional; instead, design-level screening, influence diagnostics and sensitivity analyses will ensure transparency.
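The analytic core above can be sketched numerically: logit-transform each study's AUC, carry the standard error to the logit scale by the delta method, estimate between-study variance with a DerSimonian-Laird-style moment estimator, then fit a weighted regression on a moderator such as (log) imbalance ratio, with an Egger-style intercept test for small-study effects. All study values below are made-up illustration data, and real analyses would use a dedicated meta-analysis package rather than this hand-rolled version.

```python
import numpy as np

# hypothetical per-study inputs: AUC, its standard error, one moderator
auc = np.array([0.72, 0.78, 0.81, 0.69, 0.85, 0.76])
se_auc = np.array([0.04, 0.03, 0.05, 0.06, 0.02, 0.04])
imb_ratio = np.array([9., 4., 19., 24., 3., 12.])   # majority:minority

# logit-transform AUCs; delta-method SE on the logit scale
y = np.log(auc / (1 - auc))
se = se_auc / (auc * (1 - auc))

# moment estimator of tau^2 from the intercept-only fixed-effect model
w_fe = 1 / se**2
mu = np.sum(w_fe * y) / np.sum(w_fe)
Q = np.sum(w_fe * (y - mu) ** 2)                    # Cochran's Q
c = np.sum(w_fe) - np.sum(w_fe**2) / np.sum(w_fe)
tau2 = max(0.0, (Q - (len(y) - 1)) / c)             # truncated at zero

# random-effects weights, then weighted least squares on the moderator
w = 1 / (se**2 + tau2)
X = np.column_stack([np.ones_like(y), np.log(imb_ratio)])
Wsqrt = np.sqrt(w)[:, None]
beta, *_ = np.linalg.lstsq(Wsqrt * X, np.sqrt(w) * y, rcond=None)
# beta[1]: change in logit-AUC per unit log imbalance ratio

# Egger-style small-study check: regress standardized effect on precision;
# an intercept far from zero suggests funnel-plot asymmetry
egger = np.polyfit(1 / se, y / se, 1)               # [slope, intercept]
```

Working on the logit scale keeps pooled estimates inside (0, 1) after back-transformation and stabilises variances for AUCs near the boundaries, which is why the protocol specifies logit-transformed AUC rather than pooling raw AUCs.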
Discussion
By combining a broad conceptual map with quantitative estimates, this review will establish when data-level versus algorithm-level balancing yields genuine improvements in discrimination, calibration and cost-sensitive metrics across diverse medical domains. The findings will guide researchers in choosing parsimonious, evidence-based imbalance corrections, inform journal and regulatory reporting standards, and highlight research gaps, such as the under-reporting of calibration and misclassification costs, that must be addressed before balanced models can be trusted in clinical practice.
Systematic review registration
INPLASY202550026