AASE: AI-Driven Automated Answer Script Evaluation

Abstract

In today’s educational systems, evaluating student answer scripts is challenging due to varied grading criteria, diverse question types, and the many ways the same question can be answered. Traditional manual grading is often inconsistent, inefficient, and sometimes prone to bias, making it difficult to ensure fairness in assessments. These challenges are compounded when dealing with different types of answers, such as written responses in English and mathematical solutions, each of which requires a distinct approach, increasing the workload and the chance of error in manual grading. To address these challenges, we propose an Automated Answer Script Evaluation (AASE) system. This solution leverages advanced Natural Language Processing (NLP) techniques along with mathematical parsing algorithms to automate the grading process comprehensively. The proposed AASE system is trained on a diverse dataset covering various grading criteria and incorporates multiple techniques for evaluating both English-based and mathematical answers. For English responses, the system employs keyword matching, the Word Mover’s Distance (WMD) algorithm, a BERT model with an additional layer, and a BERT-based model with dropout layers for sequence classification. These models ensure accurate and unbiased evaluation of student answers in English. Additionally, the AASE system integrates optical character recognition (OCR) technology to recognize handwritten mathematical expressions, which are converted into LaTeX format using an encoder-decoder architecture. The converted expressions are then evaluated, providing flexibility in assessing both direct answers and detailed step-wise solutions. The AASE system is trained and tested on real-world datasets, achieving an accuracy of 80.45% on subjective answers and 76% on mathematical answer evaluation.
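The abstract names keyword matching as one of the techniques for scoring English responses. As a rough illustration only, the sketch below shows one plausible form such a scorer could take: awarding marks in proportion to the expected keywords found in a student answer. The function name, signature, and scoring rule are hypothetical and are not taken from the paper.

```python
def keyword_score(answer: str, keywords: list[str], max_marks: float) -> float:
    """Hypothetical keyword-matching scorer: award marks proportional to
    the fraction of expected keywords present in the student's answer.

    This is an illustrative sketch, not the AASE system's actual method.
    """
    if not keywords:
        return 0.0
    tokens = set(answer.lower().split())
    hits = sum(1 for kw in keywords if kw.lower() in tokens)
    return max_marks * hits / len(keywords)


# Example: all three expected keywords appear, so full marks are awarded.
full = keyword_score(
    "Photosynthesis converts light energy into chemical energy",
    ["photosynthesis", "light", "chemical"],
    max_marks=5.0,
)

# Example: only one of two expected keywords appears, so half marks.
partial = keyword_score(
    "Photosynthesis occurs in plants",
    ["photosynthesis", "chlorophyll"],
    max_marks=4.0,
)
```

In practice such a scheme would be combined with semantic measures like WMD, since pure keyword matching cannot credit paraphrased answers.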
