A Hybrid Machine Translation Framework for Low-Resource Indian Languages Using Differential Programming Loss Optimization

Rituraj Dixit
Sarabjeet Singh Bedi
Ibrahim Aljubayri
Mohammad Zubair Khan

Abstract

This paper proposes a hybrid machine translation (MT) framework for low-resource Indian languages that integrates an Iterative Data Merger (IDM), Synthetic Data Generation (SDG), and Differential Programming Loss Optimization (DPLO). The framework is evaluated on English→Bhojpuri and English→Punjabi translation tasks, with experiments conducted across legal, financial, and multidomain corpora. Results show that the proposed model consistently outperforms baseline systems and partial configurations, achieving improvements of up to +2.87% BLEU, +3.33% METEOR, and +3.00% RIBES over the baseline. Domain-specific analysis reveals that financial texts yield higher translation quality than legal texts, which the authors attribute to reduced terminological complexity, while cross-lingual comparisons indicate that Bhojpuri benefits more than Punjabi from resource availability and script alignment with Hindi. Ablation studies confirm the complementary impact of IDM, SDG, and DPLO, with the full model delivering the strongest overall performance. These findings highlight the effectiveness of the proposed approach for domain-adapted translation in low-resource settings and underscore its potential for scaling to other Indian languages.

Version published to 10.21203/rs.3.rs-7582145/v1 on Research Square
Oct 1, 2025

Related articles

Variability in Low-Resource Machine Translation Evaluation: Authentic vs. LLM-Generated Training Corpora

This article has 3 authors:
1. Sofía García González
2. German Rigau Claramunt
3. Jose Ramom Pichel Campos
This article has no evaluations
Latest version Jan 21, 2026

Construction of a cross-domain machine translation model based on meta-learning and semantic transfer

This article has 1 author:
1. Yongjian Wang
This article has no evaluations
Latest version Jan 6, 2026

Neural Machine Translation and Multilingual NLP: A Survey of Methods, Architectures, and Applications

This article has 3 authors:
1. Yao Yuna
2. Junhao Song
3. Jing Qiao
This article has no evaluations
Latest version Jan 6, 2026
