Centroid analysis: Inferring concept representations from open-ended word responses

Aliona Petrenco
Fritz Guenther

Abstract

The present research proposes and evaluates a novel method, centroid analysis, for measuring representations and concepts at both the individual and the group level by mapping open-ended responses onto a pre-existing semantic vector space. Centroid analysis retraces the target concept as the geometric center of the semantic vectors of the responses that this concept elicits. At the group level, centroid analysis enables researchers to compare conceptual structures across populations and to investigate how factors such as language, culture, cognitive differences, educational background, or exposure to specific narratives shape shared representations. At the individual level, centroid analysis allows for fine-grained assessments of how personal experiences, expertise, cognitive styles, or even temporary contextual influences affect conceptual representations. We evaluate this method using two distributional semantic models across several calculation methods, reference lexicon sizes, response types, and datasets, with tasks ranging from single word substitutions to single and multiple free associations and multiple feature generation. We conclude that, at the group level, the best way to retrace the response-generating concept as a vector in a multi-dimensional semantic space from the averaged vectors of participant responses is to collect multiple free associations (70 unique and 245 total responses per cue), to use fastText for meaning-to-vector mapping of both responses and cues, and to count each response in the centroid calculation as often as it occurred in the data. At the individual level, the best results are achieved by employing fastText and including at least 8 responses per item per participant in the centroid calculation.
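The centroid calculation described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`centroid`, `cosine`, `rank_of_cue`), the three-dimensional toy vectors, and the rank-based check are all hypothetical, and a real analysis would look the vectors up in a pretrained model such as fastText. The `weight_by_frequency` flag contrasts counting each response as often as it occurred with counting each unique response once.

```python
from collections import Counter

import numpy as np


def centroid(responses, vectors, weight_by_frequency=True):
    """Average the vectors of all responses given to one cue.

    responses: list of response words (repeats allowed).
    vectors:   dict mapping word -> 1-D numpy array (hypothetical lexicon).
    """
    # Responses without a vector in the lexicon are skipped.
    counts = Counter(r for r in responses if r in vectors)
    if not counts:
        raise ValueError("no response has a vector in the lexicon")
    words = list(counts)
    matrix = np.array([vectors[w] for w in words], dtype=float)
    if weight_by_frequency:
        # Each response counts as often as it occurred in the data.
        weights = np.array([counts[w] for w in words], dtype=float)
    else:
        # Each unique response counts once.
        weights = np.ones(len(words))
    return np.average(matrix, axis=0, weights=weights)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def rank_of_cue(cue, center, vectors):
    """Rank of the cue (1 = closest) among all lexicon words by similarity to the centroid."""
    ordered = sorted(vectors, key=lambda w: cosine(center, vectors[w]), reverse=True)
    return ordered.index(cue) + 1


# Toy vectors, invented purely for illustration.
toy_vectors = {
    "dog": np.array([1.0, 0.0, 0.0]),
    "bark": np.array([0.8, 0.2, 0.0]),
    "pet": np.array([0.6, 0.4, 0.0]),
    "car": np.array([0.0, 0.0, 1.0]),
}
responses = ["bark", "bark", "bark", "pet"]  # made-up associations to the cue "dog"

token_centroid = centroid(responses, toy_vectors)
type_centroid = centroid(responses, toy_vectors, weight_by_frequency=False)
```

In this toy setup the frequency-weighted centroid is pulled toward the most common response, and `rank_of_cue("dog", token_centroid, toy_vectors)` gives one plausible way to check how well the centroid recovers the cue within a reference lexicon.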

Version published to 10.31234/osf.io/2xbuh_v1 on OSF Preprints
May 5, 2025

Related articles

A two-dimensional space of linguistic representations shared across individuals

This article has 5 authors:
1. Greta Tuckute
2. Elizabeth J. Lee
3. Yongtian Ou
4. Evelina Fedorenko
5. Kendrick Kay
This article has no evaluations. Latest version: May 23, 2025

Figurative Archive: an open dataset and web-based application for the study of metaphor

This article has 12 authors:
1. Maddalena Bressler
2. Veronica Mangiaterra
3. Paolo Canal
4. Federico Frau
5. Fabrizio Luciani
6. Biagio Scalingi
7. Chiara Barattieri di San Pietro
8. Chiara Battaglini
9. Chiara Pompei
10. Fortunata Romeo
11. Luca Bischetti
12. Valentina Bambini
This article has no evaluations. Latest version: Apr 15, 2025

Uncovering patterns of semantic predictability in sentence processing

This article has 4 authors:
1. Cassandra L. Jacobs
2. Ryan Hubbard
3. Kara D. Federmeier
4. Loïc Grobol
This article has no evaluations. Latest version: May 9, 2025
