Dynamic Uncertainty-Aware Pseudo-Labeling with Ensemble Reweighting for Robust Semi-Supervised Text Classification

Abstract

Semi-supervised text classification is essential in settings where labeled data are limited but unlabeled data are abundant. Although pre-trained language models perform well in classification tasks, their effectiveness still depends on sufficient annotated examples. Many existing semi-supervised approaches struggle with unreliable pseudo-labels, limited use of uncertainty, and rigid ensemble strategies. To address these issues, we introduce Dynamic Uncertainty-aware Pseudo-Labeling with Ensemble Reweighting (DURPEL), an algorithm designed to improve reliability and robustness in semi-supervised learning, particularly under class imbalance. DURPEL incorporates an ensemble of independently trained BERT-Base student models, combining entropy-based uncertainty estimation, confidence-adaptive pseudo-labeling, and weighted ensemble voting. It further employs an adaptive reweighting mechanism that adjusts the learning importance of unlabeled samples based on model consistency, uncertainty, and historical difficulty, allowing the model to focus on informative cases. Experiments on the USB benchmark show consistent gains over existing methods, and ablation studies highlight the complementary strengths of DURPEL's components. The results demonstrate that DURPEL offers a stable and effective solution for semi-supervised text classification.
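The abstract's core mechanism, entropy-based uncertainty estimation combined with consistency-weighted selection of pseudo-labels from an ensemble of student models, can be illustrated with a minimal sketch. Note that the function name `durpel_weights`, the threshold `tau`, and the exact weighting formula below are illustrative assumptions, not the paper's specification; the full method also tracks historical difficulty, which is omitted here.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of a probability distribution."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def durpel_weights(ensemble_probs, tau=0.7):
    """Hypothetical sketch of DURPEL-style uncertainty-aware reweighting.

    ensemble_probs: array of shape (M, N, C) holding softmax outputs from
    M student models over N unlabeled samples and C classes.
    Returns pseudo-labels, a confidence mask, and per-sample weights.
    """
    mean_probs = ensemble_probs.mean(axis=0)            # (N, C) ensemble average
    pseudo = mean_probs.argmax(axis=1)                  # pseudo-labels from weighted vote
    conf = mean_probs.max(axis=1)                       # ensemble confidence per sample
    n_classes = ensemble_probs.shape[-1]
    u = entropy(mean_probs) / np.log(n_classes)         # normalized uncertainty in [0, 1]
    votes = ensemble_probs.argmax(axis=2)               # (M, N) per-model predictions
    consistency = (votes == pseudo).mean(axis=0)        # agreement with the ensemble label
    mask = conf >= tau                                  # confidence-adaptive selection
    weights = consistency * (1.0 - u) * mask            # down-weight uncertain, inconsistent cases
    return pseudo, mask, weights
```

In this sketch, a sample contributes to the unlabeled loss only if the ensemble is confident in it, and its weight shrinks as predictive entropy rises or as the student models disagree, mirroring the reweighting behavior the abstract describes.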