Potential Use of ChatGPT for Automated Essay Scoring Based

Abstract

The rapid advancement of Artificial Intelligence (AI) has significantly influenced educational practice, particularly in writing assessment. Automated Essay Scoring (AES) systems offer a promising alternative to traditional scoring methods by enhancing consistency, efficiency, and scalability. However, integrating AI into high-stakes assessments such as IELTS Writing Task 2 requires rigorous evaluation to ensure reliability and alignment with human judgment. This study explores the potential of ChatGPT, an advanced AI language model, as a tool for scoring essays against the IELTS Writing Task 2 criteria—Task Response, Coherence and Cohesion, Lexical Resource, and Grammatical Range and Accuracy. Employing a quantitative associational ex post facto design, 30 essays were scored by both certified human raters and ChatGPT; intra-class correlation coefficients (ICC) were used to assess reliability and MANOVA to compare scoring accuracy. The findings reveal that while ChatGPT demonstrates high internal consistency in scoring, significant discrepancies persist when compared to human raters, particularly in Coherence and Cohesion. These results highlight both the potential and the limitations of ChatGPT in AES, suggesting that it can complement, but not yet replace, human evaluators in complex writing tasks. The study contributes to the ongoing discourse on the role of AI in education, emphasizing the need for further refinement to optimize AI-assisted assessment for fairness and precision. Beyond its theoretical contributions, the study offers practical insights for language educators, testing bodies, and policymakers on how AI can be responsibly integrated into large-scale writing assessments.
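The reliability analysis the abstract describes can be sketched as follows. This is a minimal, self-contained illustration of an intra-class correlation of type ICC(2,1) (two-way random effects, absolute agreement, single rater), one common choice for comparing an AI rater against a human rater; the abstract does not specify which ICC variant the study used, and the scores below are invented for illustration, not the study's data.

```python
def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per essay, one column per rater,
            e.g. [[human_score, chatgpt_score], ...].
    """
    n = len(scores)      # number of subjects (essays)
    k = len(scores[0])   # number of raters

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between essays
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)            # mean square, subjects
    msc = ss_cols / (k - 1)            # mean square, raters
    mse = ss_err / ((n - 1) * (k - 1)) # mean square, error

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative band scores (hypothetical): [human, ChatGPT] per essay.
essays = [[6.0, 6.5], [7.0, 7.0], [5.5, 5.0], [8.0, 7.5], [6.5, 6.5]]
print(round(icc2_1(essays), 3))
```

High internal consistency with systematic human–AI discrepancy, as reported in the abstract, would show up here as a lower absolute-agreement ICC than a consistency-type ICC, since ICC(2,1) penalizes constant offsets between raters.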
