Human Experts Vs. LLMs: Who is Better at Explaining Students’ Clustering into Knowledge Profiles?
Abstract
Large Language Models (LLMs) are increasingly used in educational settings to enhance assessment and feedback. While prior research has focused primarily on their ability to score responses or model learners’ knowledge, less attention has been given to their use in explaining outputs of machine learning algorithms – a key goal of explainable AI (XAI) in education. In this study, we explore the capacity of ChatGPT to generate natural-language explanations of student knowledge profiles derived from clustering analysis of multi-item chemistry assessments. These explanations are compared to those authored by human experts, with 16 chemistry teachers evaluating both versions in a blind review. While ChatGPT’s explanations were generally preferred for profiles representing simpler student performance patterns, human-authored explanations were favored for more complex profiles requiring nuanced pedagogical reasoning. Our findings highlight the capabilities and limitations of LLMs in generating high-level explanations of algorithmic outputs and suggest that relying on LLMs to analyze multi-item assessment data may actually work against students with more complex knowledge structures.