LeCoder: A Large-Scale Automated Coder for Coding Errors in Word Production Tasks

Shanhua Hu
Delaney DuVal
Brielle C Stark
Nazbanou Nozari

Abstract

Speech errors have been instrumental in advancing our understanding of the architecture of the language production system, the nature of its representations, and its disorders. To be most informative, speech error analyses usually require large amounts of data. Hand-coding such data can be both cumbersome and subjective. This paper presents LeCoder, the first open-source, automated error coder, which uses a data-driven approach grounded in large-scale corpora to quantify the target-response relationship, allowing it to be flexible, scalable, and generalizable across new datasets. By testing the coder on two datasets from two aphasia labs that have been carefully coded by trained research assistants, we first establish that LeCoder has high accuracy when compared to expert coders and, in certain cases, offers a more logical categorization than human coders. We then show, using robust machine-learning approaches, that LeCoder’s performance generalizes to participants and items it has never encountered before. Collectively, these findings encourage the use of LeCoder across labs for more objective coding of speech errors, which will, in turn, increase the replicability of findings in all subfields of research that use speech error analysis, including neuropsychological research.

Version published to 10.31234/osf.io/jhng4_v1 on OSF Preprints
Jul 31, 2025