Writer Identification of Arabic Historical Document Using a Deep Learning Approaches
Abstract
Historical documents hold valuable information for scientific and literary research. Many of them are degraded, especially on their initial pages, which makes identifying the writer difficult when no attribution exists. Arabic historical documents pose two challenges: the complexity of the script and poor physical condition. We address the problem of identity loss in Arabic historical documents with a deep learning-based approach. We used a subset of the WAHD dataset comprising 16,491 images, 60% from known authors and 40% from unknown authors. Data augmentation was applied to increase diversity. The data was split into 70% for training, 10% for testing, and 20% for validation. We implemented two models. The first, Deep Writer, is a deep convolutional neural network with a dual-path architecture consisting of multiple convolutional, pooling, and fully connected layers. The second, Half Deep Writer, has a similar structure but uses a single pipeline. We experimented with different learning rates and found that 0.0001 and 0.0002 gave the best results. Because the classes are imbalanced, model performance was evaluated using precision, recall, and F1-score. The Deep Writer model achieved 92.28% accuracy and an F1-score of 81.16%, while the Half Deep Writer model achieved 92.10% accuracy and an F1-score of 81.63% at a learning rate of 0.0002.
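The gap between accuracy (about 92%) and F1-score (about 81%) reported above is typical of imbalanced data. The following is a minimal sketch, not the authors' code, of a 70/10/20 split and of per-class precision, recall, and F1 averaged over classes. The function names `split_indices` and `macro_scores` are hypothetical, macro averaging is an assumption (the abstract does not say which averaging was used), and the abstract does not say whether augmentation was applied before or after the split.

```python
import random
from collections import Counter


def split_indices(n, train=0.7, test=0.1, seed=0):
    """Shuffle range(n) and cut it into train/test/validation index lists.

    Validation receives whatever remains (20% with the default fractions).
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_test = int(n * test)
    return (idx[:n_train],
            idx[n_train:n_train + n_test],
            idx[n_train + n_test:])


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging (an assumption here) weights every class equally,
    so rare writers count as much as frequent ones.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    k = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


# Split sizes for the 16,491 images mentioned in the abstract.
train_idx, test_idx, val_idx = split_indices(16491)
print(len(train_idx), len(test_idx), len(val_idx))

# Toy imbalanced case: nine samples of writer "A", one of writer "B",
# and a classifier that always answers "A".
y_true = ["A"] * 9 + ["B"]
y_pred = ["A"] * 10
acc, prec, rec, f1 = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.2f} macro-F1={f1:.2f}")
```

In the toy case the classifier never recognises writer "B", yet its accuracy stays high because "A" dominates; the macro F1-score drops sharply, which is why the F1-score is the more informative figure when some writers have few samples.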