AI-based Scoring of Memory Search Processes in a Creative Thinking Task
Abstract
Creative thinking has been linked to how people search their semantic memory. Past research highlighted two modes of memory search that are particularly relevant for creativity: clustering (staying within a semantic category) and switching (jumping to a new one). Yet little work has investigated semantic memory search during a divergent thinking task, partly due to the difficulty of scoring clustering and switching. Large language models (LLMs) have shown great promise for automatically scoring the product of creative thinking, such as the originality of ideas, but no work to date has tested LLMs for scoring the process of creative thinking. Here, we conduct an extensive investigation into the automated scoring of clustering and switching in multilingual responses to the Alternate Uses Task (AUT; N = 5,992), a common creativity task in which people generate new uses for everyday objects. We prompted two LLMs, GPT-4o and GPT-5, to classify AUT ideas as clustering or switching. We also trained several machine learning models on unsupervised metrics spanning text embeddings and attention weights extracted from an open-source multilingual LLM, XLM-RoBERTa. Across all our automated scoring approaches, GPT-5 "zero-shot" prompting, without any examples, achieved the highest accuracy, correctly classifying 80% of clustering and 76% of switching responses. Relative to human inter-rater agreement, GPT-5 reached 93% of the accuracy achieved by human classifications. Our findings show that memory search processes relevant to creativity can be successfully scored across multiple languages using LLMs. We provide open access to our machine learning models, code, and dataset.
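As a rough illustration of the embedding-based scoring idea described in the abstract, the sketch below labels consecutive AUT responses as clustering or switching by thresholding the cosine similarity of their embeddings. The threshold value, the toy vectors, and the function names are illustrative assumptions, not the authors' actual pipeline (which uses XLM-RoBERTa embeddings and trained classifiers):

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (as plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_transitions(embeddings, threshold=0.5):
    """Label each consecutive pair of responses as 'clustering' (similar)
    or 'switching' (dissimilar). The 0.5 threshold is a placeholder; in
    practice it would be learned or tuned against human ratings."""
    labels = []
    for prev, curr in zip(embeddings, embeddings[1:]):
        sim = cosine_similarity(prev, curr)
        labels.append("clustering" if sim >= threshold else "switching")
    return labels


# Toy 2-D vectors standing in for sentence embeddings of three AUT ideas
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(label_transitions(toy))  # ['clustering', 'switching']
```

A fixed threshold is the simplest possible decision rule; the paper instead trains machine learning models on such unsupervised metrics, which lets the decision boundary adapt to language and item effects.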