An Analysis of English Vowel Variation in Pakistani vs. Arabic Talkers: A Computational Acoustic and Machine Learning Approach

Abstract

The paper examines the acoustic properties of English vowels produced by non-native speakers from two linguistic and cultural backgrounds: Pakistani English (PakE) and Arabic English (ArE). Using a multi-method framework built on machine learning, the study explores how the first language affects English vowel production among native speakers of Pahari in Pakistan and Arabic speakers in the Kingdom of Saudi Arabia (KSA). Participants (10 per region, mixed-sex) produced a list of English words containing 10 target vowels in a CVC (hVd) frame, embedded in carrier sentences and spoken without pauses. F1 and F2 formant frequencies and vowel duration were extracted using Praat version 6.1.04. The data were analysed and visualised in Python using vowel space plots, Euclidean distances, and speaker clustering patterns. Vowel classification and speaker-group prediction were carried out with supervised and unsupervised machine learning algorithms, including logistic regression and k-means clustering. This process revealed systematic phonological patterns in both groups. The results showed consistent within-group differences and significant differences from the British English vowel targets. These findings suggest that PakE and ArE follow systematic phonological rules. The study has implications for pronunciation teaching, the building of speech recognition systems, and the development of region-specific text-to-speech (TTS) synthesisers. It also discusses the importance of open-source tools in computational phonetics, where Python-based analysis is becoming a common part of code-driven workflows.
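The Euclidean-distance step mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each vowel token is reduced to an (F1, F2) pair in Hz, averages tokens into one centroid per group and vowel, and measures the distance between the two groups' centroids for each vowel. The function names and all formant values below are hypothetical and chosen for illustration only; they are not the study's code or data, and the abstract does not say whether distances were computed on raw or normalised formants.

```python
import math
from collections import defaultdict

# Hypothetical tokens: (group, vowel, F1 in Hz, F2 in Hz).
# Illustrative values only; NOT measurements from the study.
TOKENS = [
    ("PakE", "i", 300.0, 2300.0),
    ("PakE", "i", 320.0, 2250.0),
    ("ArE", "i", 340.0, 2200.0),
    ("ArE", "i", 360.0, 2150.0),
    ("PakE", "a", 700.0, 1300.0),
    ("PakE", "a", 720.0, 1350.0),
    ("ArE", "a", 680.0, 1400.0),
    ("ArE", "a", 660.0, 1450.0),
]


def vowel_centroids(tokens):
    """Return the mean (F1, F2) for every (group, vowel) pair."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for group, vowel, f1, f2 in tokens:
        acc = sums[(group, vowel)]
        acc[0] += f1
        acc[1] += f2
        acc[2] += 1
    return {key: (s1 / n, s2 / n) for key, (s1, s2, n) in sums.items()}


def group_distances(tokens, group_a, group_b):
    """Euclidean distance in F1-F2 space between two groups' centroids, per vowel."""
    centroids = vowel_centroids(tokens)
    vowels = sorted({vowel for (_, vowel) in centroids})
    distances = {}
    for vowel in vowels:
        a = centroids.get((group_a, vowel))
        b = centroids.get((group_b, vowel))
        if a is not None and b is not None:  # skip vowels missing in either group
            distances[vowel] = math.dist(a, b)
    return distances


if __name__ == "__main__":
    for vowel, d in group_distances(TOKENS, "PakE", "ArE").items():
        print(f"/{vowel}/: {d:.1f} Hz")
```

A larger distance for a given vowel would indicate that the two groups realise it further apart in the vowel space. The same centroid table could also serve as input to a vowel space plot or a clustering step.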
