More Than a Model: The Compounding Impact of Behavioral Ambiguity and Task Complexity on Hate Speech Detection
Abstract
The automated detection of hate speech is a critical but difficult task: its subjective, behavior-driven nature leads to frequent annotator disagreement. While advanced models (e.g., transformers) are state-of-the-art, it is unclear how their performance is affected by the methodological choice of label aggregation (e.g., ‘majority vote’ vs. ‘unanimous agreement’) and by task complexity. We conduct a 2×2 quasi-experimental study to measure the compounding impact of two factors: Labeling Strategy (low-ambiguity ‘Pure’ data vs. high-ambiguity ‘Majority’ data) and Task Granularity (Binary vs. Multi-class). We evaluate five models (Logistic Regression, Random Forest, LightGBM, GRU, and ALBERT) across four quadrants derived from the HateXplain dataset. We find that (1) ALBERT is the top-performing model in all conditions, achieving its peak F1-Score (0.8165) on the ‘Pure’ multi-class task; (2) label ambiguity is the primary driver of performance loss, with ALBERT’s F1-Score dropping by ≈15.6% (from 0.8165 to 0.6894) when trained on noisy ‘Majority’ data in the multi-class setting; and (3) this negative effect is compounded by task complexity, with the performance drop being nearly twice as severe for the multi-class task as for the binary task. A sensitivity analysis confirms that the drop is attributable to data quality (label noise), not sample size. We conclude that behavioral label ambiguity is a more significant bottleneck to model performance than model architecture, providing strong evidence for a data-centric approach.
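To make the two labeling strategies concrete, the sketch below shows one way the ‘Pure’ (unanimous agreement) and ‘Majority’ (majority vote) label sets could be derived from HateXplain’s three annotations per post. This is an illustrative assumption, not the authors’ pipeline: the function name `aggregate`, the variable names, and the tie-handling rule are hypothetical.

```python
# Minimal sketch (assumed, not the authors' code): deriving 'Pure' vs.
# 'Majority' label sets from per-post annotator votes. HateXplain posts
# carry three annotations each (hatespeech / offensive / normal).
from collections import Counter

def aggregate(annotator_labels):
    """Return (majority_label, is_unanimous) for one post's annotations.

    Returns (None, False) when no strict majority exists (e.g., a
    three-way tie), in which case the post is discarded.
    """
    counts = Counter(annotator_labels)
    label, votes = counts.most_common(1)[0]
    if votes <= len(annotator_labels) // 2:
        return None, False
    return label, votes == len(annotator_labels)

posts = [
    ["hatespeech", "hatespeech", "hatespeech"],  # unanimous: 'Pure' and 'Majority'
    ["hatespeech", "offensive", "hatespeech"],   # 2-of-3: 'Majority' only
    ["normal", "offensive", "hatespeech"],       # three-way tie: dropped
]

majority_set, pure_set = [], []
for votes in posts:
    label, unanimous = aggregate(votes)
    if label is None:
        continue
    majority_set.append(label)   # high-ambiguity condition
    if unanimous:
        pure_set.append(label)   # low-ambiguity condition
```

Under this reading, the ‘Pure’ condition is a strict subset of the ‘Majority’ condition, which is why the paper pairs the ambiguity comparison with a sensitivity analysis to rule out sample size as the cause of the performance drop.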