Reinforcement Learning for the Computational Interpretation of Classical Medical Heritage Texts

Si Xie
Wei Liu
Jueling Luo
Xiyue Song
Mei Ouyang
Wanjin Song
Wei Hu

Abstract

Traditional Chinese Medicine (TCM) classics are a major form of intangible heritage, preserving historically layered medical knowledge, diagnostic logic, and therapeutic epistemologies. For heritage-text digitisation, interpretive fidelity and epistemological continuity are as critical as linguistic fluency. We present R1-TCM-Translator, a heritage-oriented framework for ancient-to-modern Chinese medical translation that combines multi-objective reinforcement learning (GRPO) with a structured six-step reasoning process to make cultural-epistemic reconstruction explicit and auditable. Experiments on a philologically curated parallel corpus of 15,387 sentence pairs from eight representative TCM classics show consistent improvements over supervised fine-tuning baselines and strong general-purpose large models. R1-TCM-Translator-8B demonstrates consistent gains on lexical-alignment and semantic-consistency metrics, specifically BLEU and COMET, indicating improved cross-text interpretive stability rather than metric-specific optimisation alone. Fine-grained analyses further show an approximately 23-percentage-point increase in specialised terminology accuracy and improved consistency in interpreting complex pathogenesis-related semantics. These findings suggest that reward-guided structured reasoning can improve epistemic fidelity in digitised medical heritage archives beyond surface-form translation quality. By embedding semantic-fidelity objectives directly into optimisation, the framework operationalises heritage-aware translation as an auditable alignment process rather than a purely generative task. While doctrine-dense passages and composite interpretive risks remain challenging, the framework provides a reproducible computational pathway for digital preservation, knowledge modelling, and digital-humanities research on medical heritage texts.

Version published to 10.21203/rs.3.rs-9067668/v1 on Research Square
Mar 17, 2026

Related articles

ForVA and GCM-CLIP: A Million-Scale Multimodal Dataset and Representation Learning Framework for Virtual Autopsy

This article has 8 authors:
1. Jing Cai
2. Jikai Mao
3. Nanze Du
4. Tu Lyu
5. Hao Li
6. Yi Shen
7. Liang Shen
8. Junjun Guo
This article has no evaluations. Latest version: Mar 30, 2026

A Multimodal Large Reasoning Model For Fair and Interpretable Dermatological Diagnosis Across Skin Tones

This article has 17 authors:
1. Juexiao Zhou
2. Yuhao Shen
3. Zhangtianyi Chen
4. Yuanhao He
5. Yan Xu
6. Shuping Zhang
7. Liyuan Sun
8. Zijian Wang
9. Yinghao Zhu
10. Jiahe Qian
11. Yuyuan Yang
12. Ziwen Wang
13. Xinyuan Zhang
14. Wenbin Liu
15. Zongyuan Ge
16. Tao Lu
17. Siyuan Yan
This article has no evaluations. Latest version: Mar 31, 2026

Benchmarking MeSH-Augmented Embeddings for Biomedical Document Similarity

This article has 6 authors:
1. Rohitha Ravinder
2. Lukas Geist
3. Nelson Quiñones
4. Suhasini Venkatesh
5. Leyla Jael Castro
6. Dietrich Rebholz-Schuhmann
This article has no evaluations. Latest version: Apr 13, 2026
