AI as a Partner in Assessment: Generating Situational Judgment Tests with Large Language Models

Abstract

Situational Judgment Tests (SJTs) are widely used tools for evaluating individual abilities by presenting specific scenarios and assessing responses to them. This study explores the potential of Large Language Models (LLMs) for developing SJTs, using emotional regulation as an example. We proposed and evaluated two approaches with ERNIE 4.0, an advanced Chinese LLM: generating new items from scratch (Approach 1) and adapting items from existing examples (Approach 2). The quality of the generated items was assessed through expert evaluations and empirical testing with 93 psychology undergraduates and 184 participants from a general population pool. Results indicate that items generated by ERNIE 4.0, particularly the new items from Approach 1, met or exceeded the quality of existing tests in both expert evaluations and empirical validation, whereas the adapted items showed some limitations. This research demonstrates the potential of LLMs for developing valid assessments and offers insights for future applications.