An Investigation of Bias in Bangla Text Classification Models


Abstract

The rapid growth of natural language processing (NLP) applications has raised concerns about fairness and bias in text classification models. Despite significant advances, the evaluation of bias and fairness in Bangla text classification remains underexplored. This study investigates bias in Bangla text classification models, focusing on three key fairness metrics: Demographic Parity, Equalized Odds, and Accuracy Parity. We analyze the performance of widely used models, including Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), Long Short-Term Memory (LSTM) networks, and Bangla-BERT, on a comprehensive dataset. The results reveal disparities in fairness across models: Bangla-BERT achieves the highest fairness scores but still exhibits measurable bias. To probe these disparities, we conduct an error analysis that highlights the prevalence of bias-induced misclassifications across sensitive attributes. We further propose actionable recommendations for improving fairness in Bangla NLP models, helping to close gaps in ethical AI for low-resource languages. Our findings offer practical guidance for developing more equitable Bangla text classification systems and underscore the need for fairness-aware methodologies in future NLP research.
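
The three fairness criteria named above have standard formal definitions: Demographic Parity asks that the positive-prediction rate be equal across groups of a sensitive attribute, Equalized Odds asks that true- and false-positive rates be equal, and Accuracy Parity asks that per-group accuracy be equal. The sketch below is illustrative only, not code from the paper; the function name and the binary-sensitive-attribute assumption are ours. It shows one common way to measure the gap on each criterion, where a gap of 0.0 means perfect parity:

    import numpy as np

    def group_fairness_gaps(y_true, y_pred, groups):
        """Illustrative sketch: gaps in standard group-fairness metrics.

        y_true, y_pred: binary labels/predictions (0/1).
        groups: sensitive-attribute value per example (assumed binary here).
        """
        y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
        a, b = np.unique(groups)[:2]  # compares the first two groups only

        def rates(g):
            t, p = y_true[groups == g], y_pred[groups == g]
            pos_rate = p.mean()                                # P(Y_hat=1 | A=g)
            tpr = p[t == 1].mean() if (t == 1).any() else 0.0  # true positive rate
            fpr = p[t == 0].mean() if (t == 0).any() else 0.0  # false positive rate
            acc = (p == t).mean()                              # per-group accuracy
            return pos_rate, tpr, fpr, acc

        (pr_a, tpr_a, fpr_a, acc_a) = rates(a)
        (pr_b, tpr_b, fpr_b, acc_b) = rates(b)
        return {
            "demographic_parity_gap": abs(pr_a - pr_b),
            "equalized_odds_gap": max(abs(tpr_a - tpr_b), abs(fpr_a - fpr_b)),
            "accuracy_parity_gap": abs(acc_a - acc_b),
        }

    # Hypothetical usage with toy data (values are not from the study):
    gaps = group_fairness_gaps([1, 0, 1, 0], [1, 0, 0, 0], ["m", "f", "m", "f"])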
