The Babel Effect: Analyzing Multilingual Performance Discrepancies in Large Language Models
Abstract
Large Language Models (LLMs) like GPT-4 and mBERT have revolutionized natural language processing (NLP) by providing multilingual capabilities, making it possible to develop models that handle diverse linguistic inputs across various languages. However, despite these advances, there remains a noticeable performance gap between how well these models perform in high-resource languages such as English and low-resource languages such as Nepali or Malagasy. We term this phenomenon the "Babel Effect," highlighting the disproportionate performance that arises from differences in resource availability across languages.

This paper aims to explore the root causes of these performance discrepancies in LLMs, focusing on the underlying challenges in tokenization, training, and data scarcity. We utilize cross-lingual benchmarks, such as XGLUE and TyDiQA, to quantify these performance variations and examine them in detail. Furthermore, we propose solutions, including enhancing tokenization strategies, employing data augmentation techniques, and refining fine-tuning methods. The paper concludes with a discussion on how these improvements can mitigate the Babel Effect and lead to more equitable language modeling across diverse linguistic contexts.