EEE-MT: Enhanced Entity Encoding for Low-Resource Machine Translation
Abstract
Low-resource language translation faces significant challenges in entity coverage and contextual information utilization, leading to high entity translation error rates. To address these issues, this paper proposes an entity-enhanced encoding mechanism that integrates an entity-enhanced encoder into the Transformer architecture to strengthen the model's ability to represent entities with fine-grained differentiation. The mechanism leverages the synergistic interaction between the self-attention mechanism and the enhanced encoder, coupled with a masking strategy, to comprehensively exploit contextual information for entities. Furthermore, a multi-task joint optimization framework is introduced, incorporating entity prediction losses at both the encoder and the decoder to enhance the model's sensitivity to entity-related features. We demonstrate the effectiveness of EEE-MT in experiments on the IWSLT14 English-German, IWSLT15 English-Vietnamese, and IWSLT17 English-French and English-Chinese machine translation tasks, where EEE-MT achieves improvements of 2.37–3.84 BLEU across the four tasks. On low-resource translation tasks such as Lao-Chinese and Myanmar-Chinese, EEE-MT also surpasses mainstream large language models.
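As an illustration of the multi-task objective described above, one plausible formulation (the symbols and weights here are our assumption, not notation from the paper) combines the standard translation loss with the two entity prediction losses:

\[
\mathcal{L} = \mathcal{L}_{\mathrm{MT}} + \lambda_{\mathrm{enc}}\,\mathcal{L}_{\mathrm{ent}}^{\mathrm{enc}} + \lambda_{\mathrm{dec}}\,\mathcal{L}_{\mathrm{ent}}^{\mathrm{dec}}
\]

where \(\mathcal{L}_{\mathrm{MT}}\) is the cross-entropy translation loss, \(\mathcal{L}_{\mathrm{ent}}^{\mathrm{enc}}\) and \(\mathcal{L}_{\mathrm{ent}}^{\mathrm{dec}}\) are the encoder- and decoder-side entity prediction losses, and \(\lambda_{\mathrm{enc}}\), \(\lambda_{\mathrm{dec}}\) are interpolation hyperparameters balancing translation quality against entity sensitivity.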