Pertsch: A Corpus of Persian and German Based on Different Speech Elicitation Tasks

Abstract

This paper introduces the Pertsch Corpus, a speech corpus designed to capture variation in speech production across a range of speech elicitation tasks in Persian and German. The corpus consists of recordings of sixty speakers who completed a fixed sequence of seven speech tasks, ranging from controlled reading tasks to more open-ended communicative situations. All recordings were collected under standardized laboratory conditions and are accompanied by orthographic transcriptions, phonetic segmentation, and annotations of verbal and nonverbal elements. Because every speaker completed all tasks, the design allows a systematic within-speaker comparison of speech across communicative contexts, while the parallel structure across languages supports cross-linguistic analysis. Descriptive statistics provide an overview of the temporal and structural properties of the dataset and illustrate how speech organization varies with task and speaker. Beyond this overview, the corpus offers a flexible resource for more detailed analyses at various levels, including phonetic, prosodic, and temporal dimensions.
