Towards Secure Social Platforms: Hate Speech Detection and Classification in Indian Languages Using Hybrid Soft Computing Techniques
Abstract
The widespread adoption of high-speed internet has fueled a surge in social media usage. However, the absence of robust regulations has allowed abusive and offensive content to proliferate on these platforms. Existing research predominantly focuses on English, overlooking the rich linguistic diversity of India. Multilingualism and code-mixing make hate speech in Indian languages harder to identify and have contributed to a lack of resources. Despite these obstacles, both traditional and deep learning techniques have been applied to detecting hate speech in Indian languages. To address the remaining difficulties, we propose a novel strategy that uses hybrid soft computing methods to identify and classify hate speech in Indian languages. Our model comprises three key processes: gathering meaningful information, feature extraction, and prediction. First, we leverage BERT for code conversion and the modified Meerkat optimization (MMO) algorithm for similarity checks to discern the nature of tweets. Next, we employ UNet with the multi-color shark optimization (MCSO) algorithm for feature learning, which extracts and selects optimal features from the gathered information. Finally, we introduce the Bayesian tensorized neural network (BTNN) to classify hate speech in Indian languages, including Tamil, Malayalam, Kannada, Hindi, Bengali, and Marathi. To evaluate the effectiveness of our method, we use publicly available datasets: DravidianCodeMix, Gold-standard, L3Cube, and HASOC 2020. The simulation results show that the UNet + BTNN model consistently outperforms other models, achieving average accuracies of 98.452%, 97.856%, 98.154%, 97.579%, 96.898%, and 98.565% for Tamil, Malayalam, Kannada, Hindi, Bengali, and Marathi, respectively.