EEE-MT: Enhanced Entity Encoding for Low-Resource Machine Translation
Abstract
Low-resource language translation faces significant challenges in entity coverage and contextual information utilization, leading to high entity translation error rates. To address these issues, this paper proposes an entity-enhanced encoding mechanism that integrates an entity-enhanced encoder into the Transformer architecture to strengthen the model's ability to represent entities with fine-grained differentiation. The mechanism leverages the synergistic interaction between the self-attention mechanism and the enhanced encoder, coupled with a masking strategy, to comprehensively exploit contextual information for entities. Furthermore, a multi-task joint optimization framework is introduced, incorporating entity prediction losses at both the encoder and the decoder to enhance the model's sensitivity to entity-related features. We demonstrate the effectiveness of EEE-MT in experiments on the IWSLT14 English-German, IWSLT15 English-Vietnamese, and IWSLT17 English-French and English-Chinese machine translation tasks, where EEE-MT achieves improvements of 2.37–3.84 BLEU across the four tasks. On low-resource translation tasks such as Lao-Chinese and Myanmar-Chinese, EEE-MT also surpasses mainstream large language models.
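As an illustration of the multi-task objective described above, one plausible formulation (the symbols and weights here are our assumption, not notation from the paper) combines the standard translation loss with the two entity prediction losses:

\[
\mathcal{L} = \mathcal{L}_{\mathrm{MT}} + \lambda_{\mathrm{enc}}\,\mathcal{L}_{\mathrm{ent}}^{\mathrm{enc}} + \lambda_{\mathrm{dec}}\,\mathcal{L}_{\mathrm{ent}}^{\mathrm{dec}}
\]

where \(\mathcal{L}_{\mathrm{MT}}\) is the cross-entropy translation loss, \(\mathcal{L}_{\mathrm{ent}}^{\mathrm{enc}}\) and \(\mathcal{L}_{\mathrm{ent}}^{\mathrm{dec}}\) are the encoder- and decoder-side entity prediction losses, and \(\lambda_{\mathrm{enc}}\), \(\lambda_{\mathrm{dec}}\) are interpolation hyperparameters balancing translation quality against entity sensitivity.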