Reinforcement Learning for the Computational Interpretation of Classical Medical Heritage Texts

Abstract

Traditional Chinese Medicine (TCM) classics are a major form of intangible heritage, preserving historically layered medical knowledge, diagnostic logic, and therapeutic epistemologies. In digitising heritage texts, interpretive fidelity and epistemological continuity are as critical as linguistic fluency. We present R1-TCM-Translator, a heritage-oriented framework for ancient-to-modern Chinese medical translation that combines multi-objective reinforcement learning, using Group Relative Policy Optimization (GRPO), with a structured six-step reasoning process to make cultural-epistemic reconstruction explicit and auditable. Experiments on a philologically curated parallel corpus of 15,387 sentence pairs from eight representative TCM classics show consistent improvements over supervised fine-tuning baselines and strong general-purpose large models. R1-TCM-Translator-8B achieves gains on both lexical-alignment and semantic-consistency metrics (BLEU and COMET), indicating improved cross-text interpretive stability rather than metric-specific optimisation alone. Fine-grained analyses further show an increase of approximately 23 percentage points in specialised terminology accuracy and more consistent interpretation of complex pathogenesis-related semantics. These findings suggest that reward-guided structured reasoning can improve epistemic fidelity in digitised medical heritage archives beyond surface-form translation quality. By embedding semantic-fidelity objectives directly into optimisation, the framework operationalises heritage-aware translation as an auditable alignment process rather than a purely generative task. Although doctrine-dense passages and composite interpretive risks remain challenging, the framework provides a reproducible computational pathway for digital preservation, knowledge modelling, and digital-humanities research on medical heritage texts.
