A Benchmark Study of Classical and U-Net ResNet34 Methods for Binarization of Balinese Palm Leaf Manuscripts

Abstract

Ancient documents that have suffered physical and visual degradation pose significant challenges for digital recognition and preservation. This study evaluates ten classical binarization methods, including Otsu, Niblack, Sauvola, ISODATA, and other adaptive methods, against a U-Net ResNet34 model trained on 256 × 256 image blocks, for extracting text and separating it from the degraded regions and background of palm leaf manuscripts. We focus on two collections: Lontar Terumbalan, comprising 19 images of Balinese manuscripts from the National Library of Indonesia, and AMADI Lontarset, comprising 100 images from ICFHR 2016. The deep learning approach outperforms the classical methods on the evaluation metrics overall. The U-Net ResNet34 model reached the highest Dice score, 0.986, along with an accuracy of 0.983, SSIM of 0.938, RMSE of 0.143, and PSNR of 17.059. Among the classical methods, ISODATA performed best, with a Dice score of 0.957 and accuracy of 0.933, but still fell short of the deep learning model on most evaluation metrics.
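To make the pixel-level metrics named in the abstract concrete, the following is a minimal sketch of how Dice, accuracy, RMSE, and PSNR might be computed for one predicted mask against its ground truth. It is not the authors' evaluation code. The function name `binarization_metrics`, the convention that 1 marks a text pixel, and the peak value of 1.0 used for PSNR are all assumptions made for illustration. SSIM is omitted because it needs a windowed computation that does not fit a short example.

```python
import math


def binarization_metrics(pred, truth):
    """Compare two binary masks given as equal-sized nested lists of 0/1.

    Assumption: 1 marks a text (foreground) pixel and 0 marks background.
    """
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    if len(p) != len(t) or not p:
        raise ValueError("masks must be non-empty and the same size")

    n = len(p)
    # Pixels marked as text in both masks (true positives).
    tp = sum(1 for a, b in zip(p, t) if a == 1 and b == 1)
    # Pixels on which the two masks agree, text or background.
    agree = sum(1 for a, b in zip(p, t) if a == b)

    # Dice = 2|P ∩ T| / (|P| + |T|); defined as 1.0 when both masks are empty.
    denom = sum(p) + sum(t)
    dice = 2 * tp / denom if denom else 1.0

    accuracy = agree / n

    # Mean squared error over pixel values in [0, 1].
    mse = sum((a - b) ** 2 for a, b in zip(p, t)) / n
    rmse = math.sqrt(mse)

    # PSNR in dB with an assumed peak value of 1.0; infinite for identical masks.
    psnr = float("inf") if mse == 0 else 10 * math.log10(1.0 / mse)

    return {"dice": dice, "accuracy": accuracy, "rmse": rmse, "psnr": psnr}


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(binarization_metrics(predicted, reference))
```

Scores such as those quoted above would typically be averaged over all test images. How the study aggregates per-image values is not stated in the abstract.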
