Evaluating ChatGPT-4o’s Performance in Construction of Q-Matrix for a Cognitive Diagnostic Assessment

Abstract

This study evaluates the performance of ChatGPT-4o in constructing Q-matrices for cognitive diagnostic assessments by comparing its outputs with Q-matrices constructed by researchers and human experts. The research examines the overlap rates among these Q-matrices and assesses their validity using empirical methods. Two distinct mathematics datasets were used, and the Q-matrices were validated through statistical techniques to determine their model-data fit. The results indicate that ChatGPT-4o can generate Q-matrices that overlap to a high degree with those specified by human experts, demonstrating its potential as a tool for cognitive diagnostic assessments. The study highlights that AI-generated Q-matrices can be a valuable supplement to traditional methods, but expert validation remains essential to ensure theoretical accuracy and practical applicability. The findings suggest that a hybrid approach, integrating AI-based Q-matrix construction with expert refinement, can enhance the accuracy and efficiency of cognitive diagnostic assessments.
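The abstract does not define how the overlap rate is computed. As a minimal sketch, assuming it is the proportion of item-by-attribute cells on which two binary Q-matrices agree, it could be calculated as follows. The function name `overlap_rate` and the two small matrices are hypothetical illustrations, not data or code from the study.

```python
from typing import List

QMatrix = List[List[int]]  # rows = items, columns = attributes, entries 0 or 1


def overlap_rate(q_a: QMatrix, q_b: QMatrix) -> float:
    """Share of cells on which two Q-matrices agree.

    Assumption: "overlap rate" means simple cell-wise agreement;
    the study may use a different definition.
    """
    same_shape = len(q_a) == len(q_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(q_a, q_b)
    )
    if not same_shape:
        raise ValueError("Q-matrices must have the same dimensions")

    total_cells = sum(len(row) for row in q_a)
    if total_cells == 0:
        raise ValueError("Q-matrices must not be empty")

    matching_cells = sum(
        a == b
        for row_a, row_b in zip(q_a, q_b)
        for a, b in zip(row_a, row_b)
    )
    return matching_cells / total_cells


# Hypothetical 4-item, 3-attribute Q-matrices for illustration only.
q_expert = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
q_model = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 1, 1]]

print(f"Overlap rate: {overlap_rate(q_expert, q_model):.3f}")
```

Cell-wise agreement treats matching zeros and matching ones equally. If most entries are zero, an agreement measure restricted to the cells where at least one matrix specifies an attribute may be more informative.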
