EEE-MT: Enhanced Entity Encoding for Low-Resource Machine Translation

Abstract

Low-resource machine translation suffers from poor entity coverage and weak utilization of contextual information, resulting in high entity translation error rates. To address these issues, this paper proposes an entity-enhanced encoding mechanism that integrates an entity-enhanced encoder into the Transformer architecture, strengthening the model's ability to represent entities with fine-grained differentiation. The mechanism exploits the interplay between the self-attention mechanism and the enhanced encoder, coupled with a masking strategy, to make full use of contextual information around entities. Furthermore, a multi-task joint optimization framework is introduced that incorporates entity prediction losses at both the encoder and the decoder, enhancing the model's sensitivity to entity-related features. We demonstrate the effectiveness of EEE-MT in experiments on the IWSLT14 English-German, IWSLT15 English-Vietnamese, and IWSLT17 English-French and English-Chinese machine translation tasks, where it improves BLEU by 2.37–3.84 points across the four tasks. On low-resource translation tasks such as Lao-Chinese and Myanmar-Chinese, EEE-MT outperforms mainstream large language models.
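The abstract describes the architecture and the multi-task objective only at a high level. As a rough illustration of how such a joint loss might be wired up, the PyTorch sketch below augments encoder inputs with entity-tag embeddings and attaches entity-prediction heads to both the encoder and decoder. All module names, head names, and the weighting coefficients are assumptions for illustration, not the paper's actual implementation, and the paper's masking strategy is not reproduced here.

```python
import torch
import torch.nn as nn

class EntityEnhancedTransformer(nn.Module):
    """Hypothetical sketch: a Transformer whose encoder input is augmented
    with entity-type embeddings, plus auxiliary entity-prediction heads on
    the encoder and decoder (names and structure are assumptions)."""

    def __init__(self, vocab_size, num_entity_tags, d_model=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.ent_emb = nn.Embedding(num_entity_tags, d_model)   # entity-type embeddings
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.gen_head = nn.Linear(d_model, vocab_size)           # translation logits
        self.enc_ent_head = nn.Linear(d_model, num_entity_tags)  # encoder-side entity head
        self.dec_ent_head = nn.Linear(d_model, num_entity_tags)  # decoder-side entity head

    def forward(self, src_ids, src_ent_tags, tgt_ids):
        # Entity-enhanced encoding: token embeddings plus entity-tag embeddings.
        src = self.tok_emb(src_ids) + self.ent_emb(src_ent_tags)
        tgt = self.tok_emb(tgt_ids)
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
        memory = self.transformer.encoder(src)
        dec_out = self.transformer.decoder(tgt, memory, tgt_mask=tgt_mask)
        return (self.gen_head(dec_out),
                self.enc_ent_head(memory),
                self.dec_ent_head(dec_out))

def joint_loss(model, batch, lam_enc=0.5, lam_dec=0.5):
    """Multi-task objective: translation loss plus entity-prediction losses
    at the encoder and decoder. The weights lam_enc/lam_dec are illustrative."""
    ce = nn.CrossEntropyLoss()
    gen, enc_ent, dec_ent = model(batch["src"], batch["src_ent"], batch["tgt_in"])
    l_mt = ce(gen.flatten(0, 1), batch["tgt_out"].flatten())
    l_enc = ce(enc_ent.flatten(0, 1), batch["src_ent"].flatten())
    l_dec = ce(dec_ent.flatten(0, 1), batch["tgt_ent"].flatten())
    return l_mt + lam_enc * l_enc + lam_dec * l_dec
```

The key idea this sketch tries to capture is that entity supervision enters twice: the encoder head pushes source representations to retain entity distinctions, while the decoder head penalizes translations that drop or corrupt entity tokens.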
