Collective and Augmented Intelligence Outperform Artificial Intelligence on Emotion Recognition Tests
Abstract
Emotion recognition is fundamental to social intelligence. As artificial intelligence (AI) becomes increasingly common, its ability to recognize human emotions has become crucial. However, whether such systems, particularly multimodal large language models (MLLMs) such as GPT-4o, can match or surpass human experts remains largely unexplored. This study evaluates GPT-4o’s emotion recognition using the Reading the Mind in the Eyes Test (RMET) and its multiracial counterpart (MRMET) against human participants from low-, mid-, and high-performing groups. Results show that, on average, GPT-4o outperforms humans in identifying emotions across both tests. This trend persists across all performance groups. Yet, when we aggregate independent human decisions to simulate collective intelligence, human groups significantly surpass the performance of aggregated GPT-4o predictions, highlighting the wisdom-of-the-crowd effect. Moreover, an augmented intelligence condition that combines human and GPT-4o predictions achieves greater accuracy than humans or GPT-4o alone. These results suggest that, while GPT-4o exhibits strong base-level emotion recognition, the collective intelligence of humans and the potential of human-AI collaboration offer the most promising path toward effective emotion recognition. We discuss the implications of this research, highlighting where and when AI or human-AI collaboration can be most beneficial for society.
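The abstract refers to aggregating independent human decisions to simulate collective intelligence and to pooling human and GPT-4o predictions in an augmented condition. As a minimal sketch, assuming plurality (majority) voting as the aggregation rule and toy RMET-style items and labels (neither the rule nor the data format is specified in the abstract), the three conditions could be compared like this:

```python
# Hypothetical sketch: plurality voting over independent RMET-style responses
# to simulate collective intelligence, plus a simple "augmented" rule that
# pools human and GPT-4o votes. The study's actual aggregation rule, weights,
# and data are not given in the abstract; all values below are illustrative.
from collections import Counter
from typing import List

def plurality_vote(responses: List[str]) -> str:
    """Return the most common response among independent raters (ties broken arbitrarily)."""
    return Counter(responses).most_common(1)[0][0]

def accuracy(predictions: List[str], answers: List[str]) -> float:
    """Fraction of items answered correctly."""
    return sum(p == a for p, a in zip(predictions, answers)) / len(answers)

# Toy data: three items with correct labels and per-item independent responses.
answers = ["playful", "upset", "desire"]
human_votes = [                      # independent human responses per item
    ["playful", "comforting", "playful", "playful"],
    ["upset", "upset", "irritated", "upset"],
    ["desire", "convinced", "desire", "bored"],
]
gpt4o_votes = [                      # repeated GPT-4o samples per item (hypothetical)
    ["playful", "playful"],
    ["irritated", "irritated"],
    ["desire", "desire"],
]

collective_human = [plurality_vote(v) for v in human_votes]
collective_gpt = [plurality_vote(v) for v in gpt4o_votes]
augmented = [plurality_vote(h + g) for h, g in zip(human_votes, gpt4o_votes)]

print("Collective human accuracy:", accuracy(collective_human, answers))
print("Aggregated GPT-4o accuracy:", accuracy(collective_gpt, answers))
print("Augmented (human + GPT-4o) accuracy:", accuracy(augmented, answers))
```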