Quantum-Enhanced Handwritten Bangla Character Recognition: A Hybrid Quantum Classical Neural Network Approach

Abstract

Handwritten Bangla character recognition remains a significant computer vision challenge for a script used by over 250 million native speakers. The script's morphological complexity, which includes a large set of basic characters, numerals, modifiers, and intricate compound conjuncts, demands robust models. While classical Convolutional Neural Networks (CNNs) have shown high accuracy, they are often computationally demanding and struggle with subtle character variations. This paper addresses a research gap by designing and evaluating, to our knowledge, the first Hybrid Quantum-Classical Convolutional Neural Network (HQCNN) for this task. We propose a novel architecture that integrates a quantum convolutional layer, implemented using Random Quantum Circuits (RQCs) simulated with PennyLane, as a high-dimensional feature extractor. To isolate the impact of the quantum layer, we compared our HQCNN against a structurally identical classical CNN baseline across seven distinct experiments on four public datasets: NumtaDB, CMATERdb 3.1.2, Ekush, and BanglaLekha-Isolated. The HQCNN outperformed the classical baseline in all seven tasks, achieving a peak accuracy of 99.45% on the Ekush numerical dataset. Notably, the largest "quantum advantage" was observed in the classification of structurally complex compound characters (E6), where the HQCNN achieved 97.16% accuracy versus the baseline's 95.52% (an improvement of 1.64 percentage points). Furthermore, the HQCNN demonstrated superior learning efficiency and stability. The quantum-derived features allowed the classical backbone to converge 27–43% faster (e.g., 162.78 vs. 223.37 minutes of classical backbone training time on the 84-class mixed dataset) and with a more stable validation loss.
These results provide strong evidence that the static RQC layer, by operating in a high-dimensional Hilbert space, yields a more expressive feature representation than a classical kernel, making the classification task easier for the classical backbone. While classical simulation of the quantum circuits adds significant computational overhead, our findings on faster convergence suggest that an implementation on native quantum hardware could offer a substantial wall-clock speedup, presenting a viable and efficient paradigm for advancing OCR systems for Bangla and other complex Indic scripts.
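
As a rough illustration of what a static quantum convolutional layer does, the sketch below slides over 2x2 image patches, encodes each pixel as a rotation on one of four qubits, applies a fixed (untrained) random circuit, and reads out one Pauli-Z expectation value per qubit as an output channel. It is not the authors' circuit: the abstract states that the RQCs were simulated with PennyLane, whereas this version uses a small hand-written NumPy statevector simulator so that it runs without extra dependencies. The patch size, the RY angle encoding, the gate set, the number of operations, and all function names (`make_random_circuit`, `quanv_patch`, `quanv`) are assumptions made for illustration.

```python
import numpy as np

N_QUBITS = 4  # assumption: one qubit per pixel of a 2x2 patch


def rot(axis, theta):
    """Single-qubit rotation matrix about the x, y, or z axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    if axis == "x":
        return np.array([[c, -1j * s], [-1j * s, c]])
    if axis == "y":
        return np.array([[c, -s], [s, c]], dtype=complex)
    return np.array([[np.exp(-1j * theta / 2), 0], [0, np.exp(1j * theta / 2)]])


def apply_1q(state, gate, wire):
    """Apply a 2x2 gate to one qubit of a state tensor of shape (2,) * N_QUBITS."""
    state = np.tensordot(gate, state, axes=([1], [wire]))
    return np.moveaxis(state, 0, wire)


def apply_cnot(state, control, target):
    """Flip the target qubit on the branch where the control qubit is 1."""
    state = state.copy()
    idx = [slice(None)] * N_QUBITS
    idx[control] = 1
    # Indexing with an integer drops the control axis, shifting later axes down.
    t_axis = target - 1 if target > control else target
    state[tuple(idx)] = np.flip(state[tuple(idx)], axis=t_axis).copy()
    return state


def make_random_circuit(n_ops=8, seed=0):
    """Draw a fixed list of random rotations and CNOTs (hypothetical gate mix)."""
    rng = np.random.default_rng(seed)
    ops = []
    for _ in range(n_ops):
        if rng.random() < 0.3:
            c, t = rng.choice(N_QUBITS, size=2, replace=False)
            ops.append(("cnot", int(c), int(t)))
        else:
            axis = str(rng.choice(["x", "y", "z"]))
            ops.append((axis, int(rng.integers(N_QUBITS)), float(rng.uniform(0, 2 * np.pi))))
    return ops


def quanv_patch(pixels, ops):
    """Map 4 pixel values in [0, 1] to 4 Pauli-Z expectation values."""
    state = np.zeros((2,) * N_QUBITS, dtype=complex)
    state[(0,) * N_QUBITS] = 1.0
    for wire, p in enumerate(pixels):  # angle encoding: RY(pi * pixel)
        state = apply_1q(state, rot("y", np.pi * p), wire)
    for op in ops:  # fixed random circuit, never trained
        if op[0] == "cnot":
            state = apply_cnot(state, op[1], op[2])
        else:
            state = apply_1q(state, rot(op[0], op[2]), op[1])
    probs = np.abs(state) ** 2
    out = []
    for wire in range(N_QUBITS):
        other = tuple(a for a in range(N_QUBITS) if a != wire)
        marginal = probs.sum(axis=other)
        out.append(marginal[0] - marginal[1])  # <Z> = P(0) - P(1)
    return np.array(out)


def quanv(image, ops):
    """Apply the circuit to non-overlapping 2x2 patches (stride 2)."""
    h, w = image.shape
    out = np.zeros((h // 2, w // 2, N_QUBITS))
    for i in range(0, h - 1, 2):
        for j in range(0, w - 1, 2):
            out[i // 2, j // 2] = quanv_patch(image[i:i + 2, j:j + 2].ravel(), ops)
    return out
```

Under these assumptions, a grayscale image of shape (H, W) scaled to [0, 1] becomes a feature map of shape (H/2, W/2, 4) that a classical CNN backbone could consume in place of the raw pixels. Because the circuit parameters are fixed, the features can be precomputed once per dataset.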
