How well do Large Language Models perform as team members? Testing teamwork capabilities of LLMs
Abstract
We employed a modified battery of cognitive and social tests from the psychological sciences intended to reflect key components of teamwork. These tests were administered to five instruction-tuned large language models (LLMs) to examine their capacity to contribute to teamwork. Our results revealed concrete strengths and shortcomings of LLMs in engaging in teaming behaviours. Language models performed well on the action phases of teamwork, in which a team engages in specific work tasks such as monitoring and coordination to meet team goals. In contrast, LLMs lacked the abstract reasoning and planning required for the transition phases of teamwork. These phases are critical moments for teams to reflect on past performance and plan for upcoming tasks. Finally, LLMs exhibited distinctive behaviours in the interpersonal aspects of teamwork, predominantly applying an integrating motivational style but failing to acknowledge negative emotional states of teammates. These findings reveal how ‘off-the-shelf’ LLMs may contribute to contemporary human-AI teams and highlight the utility of experimental psychology in elucidating the behaviours of non-embodied agents.