Language Twin: A Shared-State Architecture for Terminology-Consistent Document Translation with Human-Edit Propagation: A Pilot Study
Abstract
Large language model (LLM)-based document translation systems typically treat each segment independently, discarding terminology decisions, human corrections, and discourse cues after each generation step. This stateless approach causes terminology inconsistency across segments, failure to propagate approved post-edits downstream, and redundant prompt-token consumption. Existing approaches, including document-level MT, retrieval-augmented generation, and computer-assisted translation (CAT) tools, each address individual aspects of the problem but lack a unified, state-aware architecture with provenance, update rules, and rollback semantics. We propose Language Twin, a shared-state architecture that organizes translation projects into seven versioned layers (L0–L6), supporting selective context loading, scoped human-edit propagation, and reversible updates. A pilot study translated three curated English-to-Korean document bundles (17 segments) using GPT-4o (temperature 0.3). The Language Twin condition (P1) achieved numerically higher preferred-term accuracy than the strongest baseline (17/21 vs. 14/21; not statistically significant at this sample size) and showed no repeated downstream errors in the monitored set (0/5 vs. 5/5 against the propagation-disabled ablation; Fisher's exact test, p = 0.008), while reducing prompt tokens by 39.2% relative to full-context loading (A4). In blinded human evaluation (quadratic-weighted κ = 0.71–0.78), P1 achieved the highest terminology rating (4.38/5 vs. 3.97/5) and the lowest post-editing time per segment (16.9 s vs. 19.1 s). These pilot-scale results indicate that governed shared state can improve terminology consistency and editing efficiency.
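To make the state model described above concrete, the sketch below shows one possible shape of a versioned layer store with provenance, selective context loading, and rollback. It is a minimal illustration only: the class and method names (LayerStore, LayerEntry, update, rollback, context_for) and the layer contents are hypothetical assumptions, not the paper's actual data structures or update rules.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class LayerEntry:
    """One versioned record in a layer: a value plus its provenance."""
    value: Any
    source: str   # e.g. "model" or "human-edit" (provenance)
    version: int


class LayerStore:
    """Hypothetical shared state for a translation project.

    Keys are layer IDs ("L0".."L6"); each layer keeps its full version
    history, so every update is reversible (rollback semantics).
    """

    def __init__(self) -> None:
        self._layers: dict[str, list[LayerEntry]] = {
            f"L{i}": [] for i in range(7)
        }

    def update(self, layer: str, value: Any, source: str) -> int:
        """Append a new version to a layer and return its version number."""
        history = self._layers[layer]
        entry = LayerEntry(value=value, source=source, version=len(history))
        history.append(entry)
        return entry.version

    def current(self, layer: str) -> Any:
        """Latest value of a layer, or None if the layer is empty."""
        history = self._layers[layer]
        return history[-1].value if history else None

    def rollback(self, layer: str, to_version: int) -> None:
        """Discard all versions newer than to_version."""
        self._layers[layer] = self._layers[layer][: to_version + 1]

    def context_for(self, layers: list[str]) -> dict[str, Any]:
        """Selective context loading: pull only the requested layers
        into the prompt instead of the full project state."""
        return {name: self.current(name) for name in layers}


# Usage: record a terminology decision, override it with a human edit,
# then roll back to the model's original choice.
store = LayerStore()
v0 = store.update("L2", {"neural network": "신경망"}, source="model")
store.update("L2", {"neural network": "뉴럴 네트워크"}, source="human-edit")
store.rollback("L2", to_version=v0)
print(store.context_for(["L2"]))  # {'L2': {'neural network': '신경망'}}
```

Under this framing, scoped human-edit propagation would amount to writing an approved correction into the appropriate layer so that later segments load it via context_for, while the per-layer version history is what allows a bad update to be undone rather than silently overwritten.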