A BLEU-Based Comparative Analysis of Human and ChatGPT 4.0 Translation in Kumpulan Lagu dan Cerita Anak-Anak Dwibahasa
Abstract
This study compares the translation quality of human translators and ChatGPT 4.0 using the Bilingual Evaluation Understudy (BLEU) metric, focusing on twelve stories from Kumpulan Lagu dan Cerita Anak-Anak Dwibahasa. It examines how closely ChatGPT 4.0's translations align with human translations in terms of lexical and structural similarity. The methodology comprises four main stages: preparing human and machine translation outputs, performing tokenization, calculating n-gram precision, and computing final BLEU scores as the geometric mean of n-gram precisions adjusted by a brevity penalty. The findings reveal that ChatGPT 4.0 consistently produced translations that were longer and more stylistically elaborate than the human references, yielding BLEU scores ranging from 0.4859 to 0.9068. These results indicate that although ChatGPT 4.0 can generate fluent and contextually appropriate translations, its outputs do not closely match human translations at the n-gram level. The study concludes that BLEU remains effective for measuring surface-level similarity but is limited in capturing the stylistic and interpretive aspects of AI-generated translation of children's literature.
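The four-stage pipeline described in the abstract (tokenization, n-gram precision, geometric mean, brevity penalty) can be sketched as a minimal sentence-level BLEU computation. This is an illustrative implementation of the standard BLEU formula, not the authors' actual evaluation code; the whitespace tokenizer and example sentences are assumptions for demonstration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram
    precisions (n = 1..max_n) multiplied by a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts = Counter(ngrams(reference, n))
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Brevity penalty: penalize candidates shorter than the reference.
    r, c = len(reference), len(candidate)
    bp = 1.0 if c >= r else math.exp(1 - r / c)
    return bp * geo_mean

# Hypothetical example: tokenization here is simple whitespace splitting.
ref = "the cat sat on the mat".split()
cand = "a cat sat on the mat quietly".split()
print(round(bleu(ref, cand), 4))
```

Note that a longer machine output is not penalized by the brevity penalty (which only punishes short candidates); instead, extra words lower the clipped n-gram precisions, which is consistent with the lower scores the study reports for ChatGPT 4.0's more elaborate outputs.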