More Than a Model: The Compounding Impact of Behavioral Ambiguity and Task Complexity on Hate Speech Detection

Abstract

The automated detection of hate speech is a critical but difficult task due to its subjective, behavior-driven nature, which leads to frequent annotator disagreement. While advanced models (e.g., transformers) are state-of-the-art, it is unclear how their performance is affected by the methodological choice of label aggregation (e.g., ‘majority vote’ vs. ‘unanimous agreement’) and by task complexity. We conduct a 2×2 quasi-experimental study to measure the compounding impact of these two factors: Labeling Strategy (low-ambiguity ‘Pure’ data vs. high-ambiguity ‘Majority’ data) and Task Granularity (Binary vs. Multi-class). We evaluate five models (Logistic Regression, Random Forest, LightGBM, GRU, and ALBERT) across four quadrants derived from the HateXplain dataset. We find that (1) ALBERT is the top-performing model in all conditions, achieving its peak F1-Score (0.8165) on the ‘Pure’ multi-class task. (2) Label ambiguity is the primary driver of performance loss; ALBERT’s F1-Score drops by ≈15.6% relative (from 0.8165 to 0.6894) when trained on noisy ‘Majority’ data in the multi-class setting. (3) This negative effect is compounded by task complexity, with the performance drop being nearly twice as severe for the multi-class task as for the binary task. A sensitivity analysis confirms this drop is attributable to data quality (noise), not sample size. We conclude that behavioral label ambiguity is a more significant bottleneck to model performance than model architecture, providing strong evidence for a data-centric approach.
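To make the two labeling strategies concrete, the following minimal Python sketch shows how per-example annotator labels could be aggregated; the function name, label set, and discard behavior are illustrative assumptions, not the authors' code. Each HateXplain post carries labels from three independent annotators, so ‘Pure’ here means unanimous agreement and ‘Majority’ means a strict majority vote.

```python
from collections import Counter

def aggregate_labels(annotations, strategy="majority"):
    """Aggregate one example's annotator labels (illustrative sketch).

    annotations: list of labels from independent annotators,
        e.g. ["hatespeech", "hatespeech", "offensive"].
    strategy: "pure" keeps only unanimously labeled examples
        (low ambiguity); "majority" keeps any example whose most
        common label covers more than half the annotators
        (higher ambiguity, noisier labels).
    Returns the aggregated label, or None if the example is
    discarded under the chosen strategy.
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    if strategy == "pure":
        return label if votes == len(annotations) else None
    if strategy == "majority":
        return label if votes > len(annotations) / 2 else None
    raise ValueError(f"unknown strategy: {strategy}")

# A 2-vs-1 split survives 'majority' aggregation but not 'pure'.
example = ["hatespeech", "hatespeech", "offensive"]
print(aggregate_labels(example, "majority"))  # -> "hatespeech"
print(aggregate_labels(example, "pure"))      # -> None
```

The design choice matters because the two strategies trade off differently: ‘Pure’ yields cleaner labels but a smaller dataset, while ‘Majority’ retains more examples at the cost of label noise; the abstract's sensitivity analysis attributes the observed performance drop to the noise, not the size difference.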
