Schema Dictionaries: Quantifying Schematic Content in Narratives with Natural Language Processing

Abstract

Schemas, knowledge networks distilled from repeated experiences, play a significant role in guiding memory encoding and retrieval. Yet, despite decades of experimental research on schemas, there is currently no sensitive method for quantifying schematic content in memory narratives. In this paper, we introduce a novel approach for scoring schematic content in narratives using natural language processing. Specifically, we propose an automated pipeline that builds “schema dictionaries” with word embeddings and uses those dictionaries to score each new text for the proportion of schematic words it contains. We validated this approach across three experiments. We demonstrate convergent validity by showing that schema dictionary scores correlate with subjective ratings of memory typicality, schema scores derived from generic scene descriptions, and scores based on human ratings of word relatedness. Further, we establish discriminant validity: schema dictionaries identify substantially more words in narratives elicited by the dictionaries' associated cues than in narratives for other cues. Finally, as predicted, we show that generic scene descriptions yield higher schema scores than specific autobiographical memories. Overall, this work provides a well-validated approach for automatically scoring schematic content in narratives. We provide open-source code to enable others to quantify schematic content in memory and imagination research.
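The pipeline described above can be sketched in a few lines: build a "schema dictionary" by ranking vocabulary words by embedding similarity to a cue, then score a narrative as the proportion of its words that fall in that dictionary. The following is a minimal illustration, not the authors' released code; the function names, the toy embedding vectors, and the centroid-plus-cosine-similarity ranking are assumptions for demonstration.

```python
import numpy as np

def build_schema_dictionary(cue_words, embeddings, vocab, top_n=50):
    """Hypothetical sketch: return the top_n vocabulary words whose
    embedding vectors are most cosine-similar to the centroid of the
    cue words' vectors."""
    centroid = np.mean([embeddings[w] for w in cue_words], axis=0)

    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(vocab, key=lambda w: cos(embeddings[w], centroid),
                    reverse=True)
    return set(ranked[:top_n])

def schema_score(narrative_tokens, dictionary):
    """Proportion of narrative tokens that appear in the schema dictionary."""
    if not narrative_tokens:
        return 0.0
    hits = sum(token in dictionary for token in narrative_tokens)
    return hits / len(narrative_tokens)

# Toy 2-D "embeddings" standing in for pretrained word vectors.
embeddings = {
    "beach": np.array([1.0, 0.0]),
    "sand":  np.array([0.9, 0.1]),
    "waves": np.array([0.8, 0.2]),
    "snow":  np.array([0.0, 1.0]),
    "ice":   np.array([0.1, 0.9]),
}
beach_dict = build_schema_dictionary(["beach"], embeddings,
                                     list(embeddings), top_n=3)
score = schema_score(["sand", "waves", "snow", "ice"], beach_dict)
```

In practice the embeddings would come from a pretrained model (e.g. word2vec or GloVe vectors) and narratives would be tokenized and lemmatized before scoring; those preprocessing choices are omitted here for brevity.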
