Large language model evaluating theory of mind tasks in a gamified environment

Abstract

Autism Spectrum Disorder often significantly affects reciprocal social communication, leading to difficulties in interpreting social cues, recognizing emotions, and maintaining verbal interactions. These challenges can make everyday conversations especially demanding. To support people with Autism Spectrum Disorder in developing their social competence and communication abilities, we propose an interactive game specifically designed to enhance social understanding. By incorporating gamification elements and a user-centered design approach, the application aims to balance clinical relevance with high usability, ensuring it remains accessible, engaging, and beneficial for anyone seeking to improve their social skills.

Large Language Models have recently been assessed for their ability to detect sarcasm and irony within theory of mind tasks, showing performance comparable to that of trained psychologists. However, a significant limitation remains: their dependence on traditional "black box" AI architectures, which often lack explainability, interpretability, and transparency. This limitation is particularly concerning when people with and without Autism Spectrum Disorder use these models to learn and practice social skills in safe, virtual environments.

This study investigates and compares the performance of Large Language Models and human experts in evaluating Theory of Mind tasks, providing a detailed comparative analysis. A total of 21 participants engaged with our game, and their responses were assessed by four human experts alongside GPT-4o. The results indicate that GPT-4o matches human experts in both adherence to instructional criteria and evaluation accuracy, with no statistically significant differences observed. These findings underscore the potential of LLMs to support scalable, always-available social training systems that are accessible from anywhere.
