Comprehensive Smart Data Augmentation with Multiple Transformer Models for Imbalanced Sentiment Classification

Abstract

Class imbalance remains a persistent challenge in sentiment analysis, often causing models to favor majority classes while failing to capture minority opinions. This study investigates the issue using a food-court review dataset of real user feedback and evaluates whether a diverse augmentation pipeline can mitigate its effects. The proposed approach integrates multiple augmentation strategies to generate new, sentiment-consistent training instances instead of replicating existing samples. Incorporating these augmented examples into BERT-based classifiers yielded notable gains: accuracy increased from 81.70% to 90.64%, the weighted F1-score improved from 0.788 to 0.907, and negative-class recall rose from a baseline of 0.243 to 0.892. The highest relative gain in overall accuracy (12.64%) was reported for the ALBERT model, which improved from 77.44% to 87.23%. The datasets were also tested on other models, and consistent performance was achieved across all of them with the smart augmentation approach, enabling the models to generalize more effectively and to recognize positive, neutral, and negative sentiments with greater reliability. The findings demonstrate that a multi-strategy, context-aware augmentation process can substantially enhance transformer-based sentiment classification under imbalanced data conditions. These findings establish a critical efficiency-accuracy frontier for deploying deep learning models in real-world environments. Smart Augmentation is essential for maximizing the potential of transformer architectures in sentiment analysis. It not only boosts the top-line accuracy of models like BERT by over 10% in relative terms but also transforms the model’s reliability by more than tripling its sensitivity to the negative class.
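The abstract quotes some gains as before/after values and others as relative percentages. As a minimal sketch of how the two relate, the snippet below recomputes the relative gains from the before/after figures quoted above. The helper name `relative_gain` is hypothetical and used only for illustration; it is not taken from the article.

```python
def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Before/after figures as quoted in the abstract.
bert_accuracy = (81.70, 90.64)      # percent
albert_accuracy = (77.44, 87.23)    # percent
negative_recall = (0.243, 0.892)    # fraction of negative reviews recovered

bert_gain = relative_gain(*bert_accuracy)
albert_gain = relative_gain(*albert_accuracy)
recall_ratio = negative_recall[1] / negative_recall[0]

print(f"BERT relative accuracy gain:   {bert_gain:.2f}%")
print(f"ALBERT relative accuracy gain: {albert_gain:.2f}%")
print(f"Negative recall ratio:         {recall_ratio:.2f}x")
```

Under this reading, the 12.64% figure for ALBERT is a relative gain (about 9.8 percentage points in absolute terms), the BERT accuracy gain is roughly 10.9% relative, and negative-class recall improves by a factor of about 3.7.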
