Zero-shot pseudowords memorability via representational content analysis
Abstract
Novel strings of letters (i.e., pseudowords) lack established meanings, yet they may still evoke systematic semantic signals that influence human behavior. Here, we tested whether the semantic determinants of word memorability generalize to these novel strings. To do so, we leveraged a distributional semantic model able to represent in a vector space not only attested words but also unmapped strings, as bags of character n-grams. A ridge regression model trained on item-level word memorability norms learned a linear mapping from 300-dimensional embeddings to recognition memorability and achieved strong out-of-fold performance. We then applied this model zero-shot to predict memorability for 2,100 phonotactically legal pseudowords whose baseline predictability was captured by orthographic and frequency features. Adding the zero-shot semantic score significantly improved on the baseline model. These findings show that distributional representations derived from subword statistics carry mnemonic information that is not reducible to orthographic familiarity, and that novel strings are interpreted within a shared representational space learned from language experience. More broadly, they support the view that memorability is an intrinsic attribute predictable from representational information, even in the absence of learned meanings.
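The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the embedding model, so `ngram_table` is a hypothetical table of random vectors standing in for learned subword vectors, and all items and memorability scores are synthetic. The n-gram range, bucket count, ridge penalty, fold count, and the use of string length as the only baseline feature are likewise assumptions chosen for illustration. The sketch embeds strings as averaged character n-gram vectors, fits a ridge mapping on "words" with out-of-fold evaluation, applies it zero-shot to "pseudowords", and compares a baseline regression with and without the semantic score.

```python
import zlib
import numpy as np

DIM = 300        # embedding size stated in the abstract
BUCKETS = 2000   # hypothetical number of hash buckets for n-gram vectors
rng = np.random.default_rng(0)
# Assumption: random vectors stand in for subword vectors learned from a corpus.
ngram_table = rng.normal(size=(BUCKETS, DIM))

def char_ngrams(s, n_min=3, n_max=5):
    """Character n-grams of s with boundary markers (range is an assumption)."""
    padded = f"<{s}>"
    return [padded[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def embed(s):
    """Represent any string as the mean of its hashed n-gram vectors."""
    idx = [zlib.crc32(g.encode()) % BUCKETS for g in char_ngrams(s)]
    return ngram_table[idx].mean(axis=0)

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - mu_y))
    return w, mu_y - mu_x @ w

def ridge_predict(X, w, b):
    return X @ w + b

def out_of_fold(X, y, k=5, alpha=1.0, seed=0):
    """Each item is predicted by a model that never saw it during fitting."""
    order = np.random.default_rng(seed).permutation(len(y))
    preds = np.empty(len(y))
    for test in np.array_split(order, k):
        train = np.setdiff1d(order, test)
        w, b = ridge_fit(X[train], y[train], alpha)
        preds[test] = ridge_predict(X[test], w, b)
    return preds

def r2(features, y):
    """In-sample R^2 of an ordinary least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1 - (y - A @ coef).var() / y.var()

def random_string(r, length):
    return "".join(r.choice(list("abcdefghijklmnopqrstuvwxyz"), size=length))

# Synthetic stand-ins for word norms and pseudoword data (not real results).
words = [random_string(rng, rng.integers(4, 9)) for _ in range(400)]
pseudowords = [random_string(rng, rng.integers(4, 9)) for _ in range(100)]
X_words = np.stack([embed(s) for s in words])
X_pseudo = np.stack([embed(s) for s in pseudowords])
true_w = rng.normal(size=DIM)  # hidden linear signal used to simulate scores
y_words = X_words @ true_w + rng.normal(scale=0.1, size=len(words))
y_pseudo = X_pseudo @ true_w + rng.normal(scale=0.1, size=len(pseudowords))

# Step 1: out-of-fold check of the embedding-to-memorability mapping on words.
oof = out_of_fold(X_words, y_words)
print("out-of-fold r on words:", np.corrcoef(oof, y_words)[0, 1])

# Step 2: fit on all words, then score pseudowords zero-shot (no refitting).
w, b = ridge_fit(X_words, y_words)
sem_score = ridge_predict(X_pseudo, w, b)

# Step 3: does the semantic score add to a baseline (here: string length only)?
lengths = np.array([len(s) for s in pseudowords], dtype=float)
r2_base = r2(lengths, y_pseudo)
r2_full = r2(np.column_stack([lengths, sem_score]), y_pseudo)
print("baseline R^2:", r2_base, "with semantic score:", r2_full)
```

In-sample R^2 never decreases when a predictor is added to a nested least-squares model, so a real analysis would need a significance test or cross-validated comparison for Step 3; the abstract reports that the improvement was significant but does not specify the test.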