Performance Analysis of RoBERTa in Detecting Sexism in Online Comments

Abstract

Detecting and mitigating sexist language has become a critical issue in digital communication. While human experts can identify nuanced forms of sexism, the growing volume of online content makes manual detection impractical. This study compares four machine learning approaches for automated sexism detection: trigram frequency models, text vectorization techniques, convolutional neural networks (CNNs), and RoBERTa, a transformer-based model. Traditional methods such as trigram analysis and text vectorization are useful for identifying basic patterns but struggle to capture the contextual and semantic nuances inherent in sexist language. In contrast, more advanced models, such as CNNs and RoBERTa, leverage a deeper understanding of language structure and context. Using a publicly available dataset, we evaluate the performance of these models based on accuracy, precision, recall, and F1-score. Our findings reveal that while trigram analysis and text vectorization provide some insight into surface-level patterns, RoBERTa consistently outperforms the other models, capturing the subtleties of sexist language and producing more accurate and reliable results. This research not only improves the technical methodologies for sexism detection but also contributes to the development of scalable, automated moderation tools that can address harmful linguistic patterns in real time, promoting safer and more inclusive online environments.
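
The abstract does not include implementation details. As a minimal, hedged sketch of the transformer-based approach it describes, the example below fine-tunes a RoBERTa classifier on a binary sexist/non-sexist dataset and reports the four metrics named above. The checkpoint name (roberta-base), file names (train.csv, test.csv), and column names (text, label) are assumptions for illustration, not details taken from the article.

```python
# Sketch only: fine-tune RoBERTa for binary sexism detection and report
# accuracy, precision, recall, and F1. Assumes CSV files with "text" and
# "label" (1 = sexist, 0 = not sexist) columns; these names are hypothetical.
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

MODEL_NAME = "roberta-base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Hypothetical dataset splits; replace with the actual files.
data = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    # Convert raw comment text into RoBERTa input IDs and attention masks.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

data = data.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Compute the four evaluation metrics used in the study.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="binary")
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="roberta-sexism",
        num_train_epochs=3,
        per_device_train_batch_size=16,
    ),
    train_dataset=data["train"],
    eval_dataset=data["test"],
    compute_metrics=compute_metrics,
)

trainer.train()
print(trainer.evaluate())  # prints eval_accuracy, eval_precision, eval_recall, eval_f1
```

The same evaluation function could be reused for the trigram, vectorization, and CNN baselines, so that all four models are scored on an identical metric set.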
