Human Experts Vs. LLMs: Who is Better at Explaining Students’ Clustering into Knowledge Profiles?


Abstract

Large Language Models (LLMs) are increasingly used in educational settings to enhance assessment and feedback. While prior research has focused primarily on their ability to score responses or model learners’ knowledge, less attention has been given to their use in explaining outputs of machine learning algorithms – a key goal of explainable AI (XAI) in education. In this study, we explore the capacity of ChatGPT to generate natural-language explanations of student knowledge profiles derived from clustering analysis of multi-item chemistry assessments. These explanations are compared to those authored by human experts, with 16 chemistry teachers evaluating both versions in a blind review. While ChatGPT’s explanations were generally preferred for profiles representing simpler student performance patterns, human-authored explanations were favored for more complex profiles requiring nuanced pedagogical reasoning. Our findings highlight the capabilities and limitations of LLMs in generating high-level explanations of algorithmic outputs and suggest that relying on LLMs to analyze multi-item assessment data may actually work against students with more complex knowledge structures.