A Study on OCR-Based Answer Sheet Evaluation Systems
Abstract
This paper presents a comprehensive literature survey on Optical Character Recognition (OCR)-based systems for the automated evaluation of handwritten answer sheets, emphasizing the integration of modern deep learning and language understanding techniques. The review consolidates research spanning handwritten text recognition (HTR), post-OCR correction, writer-adaptive learning, and multimodal assessment that combines textual, mathematical, and diagrammatic inputs. Recent developments leveraging large vision–language models (VLMs) such as GPT-4V are analyzed for their potential to perform semantic comparison and rubric-aware grading of handwritten solutions. The survey also examines transformer-based architectures, meta-learning frameworks for unseen writer adaptation, and hybrid OCR pipelines integrating CNN, RNN, and attention mechanisms. Key datasets, benchmark results, and performance trends across diverse educational and language settings are discussed. In addition, the paper identifies major challenges such as OCR noise propagation, reasoning inconsistencies in large language models, and domain-specific calibration requirements for STEM assessments. By synthesizing current progress and limitations, this work aims to provide a structured foundation for developing future end-to-end, multimodal, and semantically aware AI-driven evaluation systems for scalable and reliable academic assessment.

Index Terms—OCR, Handwritten Text Recognition, Answer Sheet Evaluation, Semantic Correction, Transformers, CTC, Meta-learning.
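To ground the hybrid CNN–RNN–CTC pipelines the survey covers, the following is a minimal illustrative sketch of a CRNN-style line recognizer trained with CTC loss, the architecture family underlying many surveyed HTR systems. The layer sizes, character-set size (80 classes), and use of PyTorch are assumptions made for illustration, not the configuration of any specific surveyed work.

# Minimal CRNN-style HTR sketch: a CNN extracts per-column visual
# features from a handwriting line image, a bidirectional LSTM models
# the feature sequence, and CTC loss aligns the per-time-step outputs
# with an unsegmented character transcription.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # CNN backbone: collapse image height so that each remaining
        # time step corresponds to a horizontal slice of the line.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2, 2),
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),  # height -> 1, width kept
        )
        self.rnn = nn.LSTM(256, 256, num_layers=2,
                           bidirectional=True, batch_first=True)
        # One extra output class for the CTC blank symbol (index 0).
        self.fc = nn.Linear(512, num_classes + 1)

    def forward(self, x):                      # x: (batch, 1, H, W)
        f = self.cnn(x)                        # (batch, 256, 1, W')
        f = f.squeeze(2).permute(0, 2, 1)      # (batch, W', 256)
        out, _ = self.rnn(f)                   # (batch, W', 512)
        return self.fc(out).log_softmax(-1)    # (batch, W', classes+1)

# Training-step sketch with dummy data; blank=0 by convention, so
# target labels are drawn from 1..num_classes.
model = CRNN(num_classes=80)
criterion = nn.CTCLoss(blank=0, zero_infinity=True)
images = torch.randn(4, 1, 32, 256)            # dummy line images
logits = model(images).permute(1, 0, 2)        # CTC expects (T, batch, C)
targets = torch.randint(1, 81, (4, 10))        # dummy transcriptions
input_lens = torch.full((4,), logits.size(0), dtype=torch.long)
target_lens = torch.full((4,), 10, dtype=torch.long)
loss = criterion(logits, targets, input_lens, target_lens)
loss.backward()

The CTC objective is what allows such pipelines to learn from line images paired only with transcription strings, without character-level segmentation; the attention-based and transformer decoders discussed in the survey replace this alignment mechanism with learned soft alignments.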