Testing Standards for AI-based Scores in Automated Essay Scoring

Abstract

Recent developments in computer science, in particular in artificial intelligence and machine learning, enable the wide application of large language models to the evaluation of written text and other non-numerical data. When applied in psychological and educational assessment, such models can assign scores to essays and other types of responses. In contrast to classical tests, essays do not consist of discrete test items, which creates specific challenges for evaluating testing standards for AI-based scores, challenges that differ from those encountered with classical ability tests and personality questionnaires. To address these challenges, we discuss the evaluation of validity, fairness, and reliability for scores obtained from artificial intelligence models in the context of automated essay scoring. The discussion includes a review of existing methods and the proposal of new ones. We illustrate the proposed methods with an empirical example and suggest directions for the development of additional methods.