Cross-Script Generalization in STR: Solving the Orthographic Diversity Challenge with Global Semantic Segmentation

Wang Su
Yusuke Aizawa
Ling Li

Abstract

The proliferation of multilingual digital content necessitates robust scene text recognition (STR) systems that transcend language barriers. While monolingual models achieve superhuman performance in high-resource languages like English, their cross-lingual capabilities remain constrained by orthographic diversity and typological differences. This study challenges three fundamental assumptions in multilingual STR through systematic experimentation: (1) the linguistic transferability hypothesis, (2) the typological similarity principle, and (3) the optimal resource allocation paradigm. Our investigations reveal that dataset cardinality supersedes language similarity as the dominant factor in cross-lingual adaptation. By developing a meta-transfer learning framework with dynamic resource allocation (Equation \ref{eq:capacity}), we achieve 94% accuracy in transfer performance predictions while maintaining compatibility with legacy systems. The proposed methodology demonstrates a 41% improvement in cross-lingual robustness indices compared to conventional approaches, establishing new benchmarks for multilingual STR development.

Version published to 10.20944/preprints202506.0267.v1
Jun 4, 2025

Related articles

Grammar-Driven Text Segmentation for Context Understanding of Myanmar Language

This article has 3 authors:
1. myo thida
2. Nu Wei Thet
3. Thein Kyaw LWIN
This article has no evaluations. Latest version: Jan 23, 2026

Advancing Dialectal Arabic to Modern Standard Arabic Machine Translation

This article has 3 authors:
1. Abdullah Alabdullah
2. Lifeng HAN
3. Chenghua Lin
This article has no evaluations. Latest version: Jan 22, 2026

Construction of a cross-domain machine translation model based on meta-learning and semantic transfer

This article has 1 author:
1. Yongjian Wang
This article has no evaluations. Latest version: Jan 6, 2026