The universality of vowel intrinsic F0: A corpus study of cross-linguistic and inter-speaker variability in 75 languages
Abstract
Human speech is marked by incredible diversity, yet it also exhibits recurring regularities across languages that reveal shared cognitive constraints on linguistic representation. One example is vowel intrinsic F0 (VF0), the tendency for high vowels to be produced with a slightly higher fundamental frequency (F0; the primary acoustic correlate of perceived pitch) than low vowels. While previous research has identified VF0 in several languages, its cross-linguistic scope and underlying mechanisms remain only partly characterized. Here we test whether VF0 arises as a biomechanical consequence of uniform phonetic representations or as a controlled enhancement that strengthens perceptual contrast. Using the largest and most diverse cross-linguistic dataset to date (speech from 75 languages representing 18 language families and over 60,000 speakers), we compare vowels differing in height as well as vowels differing only in backness, providing a critical test of the uniformity account. We further examine how VF0 varies across speakers, languages, vowel inventory sizes, and word prosodic systems (tonal or non-tonal). The results reveal a robust and directionally consistent VF0 effect across nearly all languages, indicating a strong bias toward underlyingly uniform phonetic targets in vowel production. At the same time, variation in the magnitude of the effect and its relationship to language structure and speaker-specificity suggests that controlled enhancement operates as a flexible, communicatively motivated adjustment. Together, these findings demonstrate how large-scale phonetic data can uncover universal constraints on speech while linking the physical realization of language to its underlying cognitive representation.