Task-based assessment reveals teamwork capabilities of large language models
Abstract
We employed a modified battery of cognitive and social tests from the psychological sciences to capture key team processes that support overall team effectiveness. These tests were administered to five instruction-tuned large language models (LLMs) to evaluate whether these models can function as effective Artificial Intelligence (AI) team members by contributing to action, transition, and interpersonal team processes. All assessments were performed on LLMs without human interaction. Our results revealed concrete strengths and shortcomings of LLMs in engaging in teaming behaviours. LLMs performed well on tasks typically executed during the action phases of teamwork, in which a team engages in specific work tasks such as monitoring and coordination to meet team goals. In contrast, LLMs lacked the abstract reasoning and planning required during the transition phases of teamwork, which are critical for reflecting on past performance and planning for upcoming tasks. Finally, LLMs displayed distinctive performance on interpersonal aspects of teamwork, predominantly applying an integrating motivational style but failing to acknowledge negative emotional states of teammates. These findings reveal how contemporary LLMs may contribute to human-AI teams as team members and highlight the utility of experimental psychology in elucidating the behaviours of non-embodied agents.