Word frequency and contextual diversity measures for Singapore English



Abstract

Singapore English, shaped by unique historical and sociocultural factors, diverges significantly from American and British English varieties. Despite its centrality in Singapore's communication landscape, comprehensive word frequency estimates for this dialect have been lacking, limiting psycholinguistic research that accurately reflects local language processing. Dialectal variation fundamentally influences cognitive mechanisms, and neglecting these differences can distort theoretical models of language processing. This study addresses this gap by constructing word frequency (WF) and contextual diversity (CD) estimates from the National Speech Corpus (NSC; Koh et al., 2019), a large-scale speech corpus of Singapore English. We analyzed three NSC subparts comprising over 10,000 transcripts and more than 30 million words to generate frequency estimates for 50,000 words, including both standard and uniquely Singaporean terms. To validate these measures, we used lexical decision data from the Auditory English Lexicon Project (Goh et al., 2020), featuring Singaporean participants responding to stimuli across different accent conditions. Linear and binomial mixed-effects models compared NSC-based estimates against SUBTLEX-US (Brysbaert et al., 2012) and SUBTLEX-UK (van Heuven et al., 2014) in predicting reaction times and accuracy. NSC estimates produced the lowest AIC values across both British and Singaporean accents, indicating superior fit to local language processing patterns. These results demonstrate that WF and CD effects are shaped by specific language experiences and highlight the necessity of dialect-appropriate psycholinguistic tools. This database advances cognitive science by enabling more accurate modeling of dialect-sensitive language processing and underscores the importance of developing similar resources for other English varieties worldwide.
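
For readers unfamiliar with the two measures, the following is a minimal sketch of how WF and CD are commonly derived from a set of transcripts: WF as occurrences per million tokens, and CD as the share of transcripts in which a word appears at least once. It is not the authors' pipeline. The function names (`tokenize`, `wf_cd`), the regex tokenizer, and the three toy transcripts are all assumptions made up for illustration; the NSC's actual transcript format and the study's preprocessing are not described in the abstract.

```python
import re
from collections import Counter

# Assumed tokenizer: lowercase runs of letters and apostrophes.
# The study's real tokenization rules are not given in the abstract.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split one transcript into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def wf_cd(transcripts):
    """Return per-word WF and CD statistics for a list of transcript strings."""
    word_counts = Counter()  # total occurrences of each word (WF)
    doc_counts = Counter()   # number of transcripts containing each word (CD)
    for text in transcripts:
        tokens = tokenize(text)
        word_counts.update(tokens)
        doc_counts.update(set(tokens))  # count each word once per transcript
    total_tokens = sum(word_counts.values())
    n_docs = len(transcripts)
    return {
        word: {
            "count": count,
            "per_million": count * 1_000_000 / total_tokens,
            "cd_count": doc_counts[word],
            "cd_percent": 100 * doc_counts[word] / n_docs,
        }
        for word, count in word_counts.items()
    }


# Toy transcripts invented for this example (not taken from the NSC).
toy = ["can lah no problem", "no lah cannot", "the shop is near the mrt"]
stats = wf_cd(toy)
```

In this toy input, "lah" and "the" both occur twice, so their WF is identical, but "lah" appears in two of the three transcripts while "the" appears in only one. That difference is what CD captures and raw frequency does not.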
