Accents Still Confuse AI: Systematic Errors in Speech Transcription and LLM-Based Remedies
Abstract
Accurate and timely documentation in the electronic health record (EHR) is essential for delivering safe and effective patient care. AI-enabled medical tools powered by automatic speech recognition (ASR) promise to streamline this process by transcribing clinical conversations directly into structured notes. However, a critical challenge in deploying these technologies at scale is their variable performance across speakers with diverse accents, which leads to transcription inaccuracies, misinterpretation, and downstream clinical risks. We measured the transcription accuracy of Whisper and WhisperX on clinical texts read by native and non-native English speakers and found that both models produce significantly higher error rates for non-native speakers. Fortunately, we found that post-processing the transcripts with GPT-4o recovers the lost accuracy. Our findings indicate that a chained model approach, WhisperX-GPT, significantly enhances transcription quality and reduces errors associated with accented speech. We make all code, models, and pipelines freely available.
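As a concrete illustration of the chained approach, the sketch below feeds a WhisperX transcript through a GPT-4o correction pass. This is a minimal sketch, not the paper's released pipeline: the model sizes, the correction prompt wording, and the `transcribe_and_correct` helper are illustrative assumptions.

```python
# Minimal sketch of a chained WhisperX -> GPT-4o transcription pipeline.
# Assumptions: "large-v2" model size, int8 compute, and the system prompt
# are placeholders, not the authors' exact configuration.
import whisperx
from openai import OpenAI

def transcribe_and_correct(audio_path: str, device: str = "cpu") -> str:
    # Step 1: ASR with WhisperX (batched Whisper inference).
    model = whisperx.load_model("large-v2", device, compute_type="int8")
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio, batch_size=16)
    raw_transcript = " ".join(seg["text"].strip() for seg in result["segments"])

    # Step 2: LLM post-processing to repair accent-related ASR errors.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": (
                    "You correct ASR transcripts of clinical speech. "
                    "Fix likely mis-transcriptions; do not add content."
                ),
            },
            {"role": "user", "content": raw_transcript},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(transcribe_and_correct("clinic_visit.wav"))
```

Note the deliberately conservative system prompt: in a clinical setting the post-processor should repair plausible mis-transcriptions without inventing content, since hallucinated text in an EHR note is its own safety risk.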