Are Eight Chatbots Better Than One? Boosting Chatbot Creative Outcomes via Exposure to Self- and Peer-Generated Examples

Abstract

An important question in the emerging field of human-AI co-creativity is how users can consistently get the most out of whichever AI systems they have at their disposal. To advance this practical know-how, the present study reports an exploratory empirical investigation of whether, and how, exposure to self- and peer-generated examples affects the creative performance of chatbots. We introduce two strategies: (a) “Pick & Mix”, which involves selecting, combining, and enhancing elements from examples, and (b) “Try to Beat”, which uses examples as baselines to outperform. We tested these strategies with eight widely used chatbots (ChatGPT, Claude, Copilot, DeepSeek, Gemini, Grok, Meta, and Perplexity) in realistic usage settings, using a two-round, multi-iteration process built on two standardized creativity tasks: the Divergent Association Task (DAT) and the Alternative Uses Test (AUT). Findings indicate that Pick & Mix is a simple and effective approach for improving chatbots’ creative performance. In contrast, Try to Beat is generally ineffective and rarely outperforms Pick & Mix. Overall, the findings suggest that chatbots can repeatedly identify and improve the best available candidates within a set of provided examples, but struggle to extract and reuse task-relevant features from those examples to generate consistently better alternatives.