Classifying Child-Directed Speech and Shared Book Reading from LENA: Language-Specific Modeling and Temporal Resolution Effects
Abstract
Language directed to children predicts early language development, motivating efforts to automate annotation in large-scale naturalistic corpora. Prior validation has focused on Western languages, leaving typologically distinct systems such as Korean underexplored. This study evaluated machine learning–based classification of caregiver–child interactional contexts in daylong Korean recordings, addressing three research objectives: examining the relative performance of cross-linguistic transfer versus language-specific training; evaluating the impact of 1-minute versus 5-minute temporal resolution in training; and exploring the automatic detection of shared book reading (BR), a low-frequency but developmentally important subtype of child-directed speech. A model pretrained on English and Spanish recordings generalized poorly to Korean data, whereas training the same model architecture on Korean recordings substantially improved performance. This indicates that patterns captured in LENA-derived acoustic and conversational features may not be readily portable across languages without adaptation. Automated detection of shared book reading showed moderate reliability, likely reflecting its sparse distribution in naturalistic data, though classification performance indicated strong discriminative ability. These findings support the feasibility of scalable, automated analysis of early language environments in a non-Western context and highlight the importance of language-specific training for extending automated approaches across diverse linguistic and cultural contexts.