Classifying Unknown and Recognizing Known Oracle Bone Characters Using Novel Data Augmentation
Abstract
This paper addresses the challenges of using deep learning to recognize Oracle Bone Characters (OBCs), an ancient hieroglyphic script used in China more than 3,500 years ago. While recent advances have improved OBC recognition, real-world applications still face obstacles, particularly in handling out-of-distribution (OOD) characters. OOD detection involves two key sub-tasks: recognizing unknown OBCs and categorizing known ones. The first sub-task has been widely overlooked in previous work, yet it is crucial for real-world archaeological applications, where discovering new characters is important. To address these challenges, we propose two novel data augmentation (DA) methods: Dynamic GridMask, which simulates the randomness and occlusion seen in OBCs, and Intra-Class image fusion, which reduces the negative impact of image fusion on recognizing unknown characters. Experiments on the Oracle-MNIST, OBC306, OBI125, and Oracle-241 datasets show increases in classification accuracy of 0.81%, 0.70%, 1.65%, and 2.41%, respectively, and average AUROC increases of 4.50%, 4.20%, 3.71%, and 4.38%, respectively. These results demonstrate that our methods benefit both known-class classification and unknown-class recognition in OOD detection.
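The abstract names the two augmentations but does not give their details, so the following is only a rough sketch of the general ideas, not the authors' implementation. It assumes that "dynamic" means the grid period, kept fraction, and offset are re-sampled on every call, and that intra-class fusion is a mixup-style blend restricted to pairs of images sharing a label. The function names `dynamic_grid_mask` and `intra_class_fusion` and all parameters and default ranges are hypothetical.

```python
import numpy as np


def dynamic_grid_mask(image, rng, unit_range=(8, 16), keep_ratio_range=(0.4, 0.7)):
    """Zero out a grid of square patches; grid geometry is re-drawn on each call.

    Assumption: the sampling ranges are illustrative, not taken from the paper.
    """
    h, w = image.shape[:2]
    unit = int(rng.integers(unit_range[0], unit_range[1] + 1))  # grid period (pixels)
    keep = rng.uniform(*keep_ratio_range)                        # kept fraction per period
    keep_len = max(1, min(unit - 1, int(round(unit * keep))))
    off_y, off_x = rng.integers(0, unit, size=2)                 # random grid offset
    rows_masked = (np.arange(h) + off_y) % unit >= keep_len
    cols_masked = (np.arange(w) + off_x) % unit >= keep_len
    # A pixel is dropped only where a masked row band crosses a masked column band.
    mask = ~(rows_masked[:, None] & cols_masked[None, :])
    if image.ndim == 3:
        mask = mask[..., None]  # broadcast over channels
    return image * mask


def intra_class_fusion(images, labels, rng, alpha=1.0):
    """Blend each image with a randomly chosen image of the same class.

    Assumption: a Beta(alpha, alpha) mixing weight, as in standard mixup.
    """
    images = np.asarray(images, dtype=np.float32)
    labels = np.asarray(labels)
    partner = np.arange(len(labels))
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        partner[idx] = rng.permutation(idx)  # partners are drawn within the class only
    lam_shape = (len(labels),) + (1,) * (images.ndim - 1)
    lam = rng.beta(alpha, alpha, size=lam_shape).astype(np.float32)
    fused = lam * images + (1.0 - lam) * images[partner]
    return fused, labels  # labels stay hard, since both sources share a class
```

Because both source images in each blend belong to the same class, the label stays a single hard class rather than a soft mixture of two classes, which is one plausible reading of how intra-class fusion could avoid blurring the boundary between known and unknown characters.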