System for Detection and Recognition of Historical Arabic Manuscripts
Abstract
Optical Character Recognition (OCR) technology automates the extraction and recognition of text from scanned documents or images, leveraging machine learning models trained on standardized datasets. Historical Arabic manuscripts, housed in national libraries and archives around the world, hold immense cultural, religious, and historical significance. However, manual analysis of these manuscripts is time-consuming and complex due to the cursive nature of Arabic script, context-dependent character shapes, and document degradation over time. This research aims to develop a robust OCR system for detecting and recognizing text in historical Arabic manuscripts. By training machine learning models on curated datasets, the system will produce accurate digitized text, enabling historians and archaeologists to analyze content efficiently. The deployment and evaluation of the system in real-world scenarios will support cultural preservation and enhance historical understanding.