Evaluating ChatGPT-4o’s Performance in Construction of Q-Matrix for a Cognitive Diagnostic Assessment
Abstract
This study evaluates the performance of ChatGPT-4o in constructing Q-matrices for cognitive diagnostic assessments by comparing its outputs with those constructed by researchers and human experts. The research examines the overlap rates among these Q-matrices and assesses their validity using empirical methods. Two distinct mathematics datasets were used, and the Q-matrices were validated through statistical techniques to determine their model-data fit. The results indicate that ChatGPT-4o can generate Q-matrices that overlap to a high degree with those specified by human experts, demonstrating its potential as a tool for cognitive diagnostic assessments. The study highlights that AI-generated Q-matrices can be a valuable supplement to traditional methods, but expert validation remains essential to ensure theoretical accuracy and practical applicability. The findings suggest that a hybrid approach, integrating AI-based Q-matrix construction with expert refinement, can enhance the accuracy and efficiency of cognitive diagnostic assessments.