Dual-BERT Adversarial Model for Text Normalization in Hausa User-Generated Contents

Abstract

This paper presents a Dual-BERT Generative Adversarial Network framework for improving text normalization in low-resource languages, specifically Hausa. By combining Bidirectional Encoder Representations from Transformers (BERT) with Generative Adversarial Networks (GANs), the model surpasses conventional Transformer-based and standalone GAN baselines on Exact Match, Word Error Rate (WER), Character Error Rate (CER), and BLEU score. Experimental results show an Exact Match score of 0.80 and notably lower error rates across all metrics. This approach strengthens NLP tools for under-represented languages, particularly in noisy, informal text such as social media content.
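
The abstract does not spell out the architecture, but one plausible reading of "Dual-BERT adversarial" is a BERT-to-BERT generator that rewrites noisy input into normalized text, trained against a second BERT-based discriminator. The sketch below illustrates that shape using Hugging Face Transformers; the checkpoint name, class layout, and loss weighting are illustrative assumptions, not details taken from the paper.

```python
import torch
from transformers import BertModel, BertTokenizerFast, EncoderDecoderModel

MODEL = "bert-base-multilingual-cased"  # assumed checkpoint; mBERT's vocabulary covers Hausa

tokenizer = BertTokenizerFast.from_pretrained(MODEL)

# Generator: a BERT encoder paired with a BERT decoder that rewrites
# noisy user-generated text into its normalized form.
generator = EncoderDecoderModel.from_encoder_decoder_pretrained(MODEL, MODEL)
generator.config.decoder_start_token_id = tokenizer.cls_token_id
generator.config.pad_token_id = tokenizer.pad_token_id

class Discriminator(torch.nn.Module):
    """Second BERT scoring whether a normalized sentence is gold or generated."""
    def __init__(self):
        super().__init__()
        self.bert = BertModel.from_pretrained(MODEL)
        self.head = torch.nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        pooled = self.bert(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        return self.head(pooled).squeeze(-1)  # one real-vs-generated logit per sentence

discriminator = Discriminator()
bce = torch.nn.BCEWithLogitsLoss()

def generator_step(noisy, gold, adv_weight=0.1):
    """Supervised seq2seq loss plus an adversarial term (illustrative weighting)."""
    src = tokenizer(noisy, return_tensors="pt", padding=True, truncation=True)
    tgt = tokenizer(gold, return_tensors="pt", padding=True, truncation=True)
    out = generator(input_ids=src.input_ids,
                    attention_mask=src.attention_mask,
                    labels=tgt.input_ids)  # cross-entropy against the gold normalization
    # Adversarial signal: generated text should be scored as "real" (gold) text.
    gen_ids = generator.generate(src.input_ids, max_length=tgt.input_ids.size(1))
    logits = discriminator(gen_ids, (gen_ids != tokenizer.pad_token_id).long())
    adv_loss = bce(logits, torch.ones_like(logits))
    return out.loss + adv_weight * adv_loss
```

Note that gradients do not flow through the discrete generate() call, so a faithful adversarial objective over text would typically use policy gradients (REINFORCE) or a Gumbel-softmax relaxation; the step above only conveys the overall shape of the training signal.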
