Large-Scale Analytical Validation of Voice-Derived Digital Biomarkers Using Automated Speech Elicitation

Adrian Attard Trevisan
Frederick R Carrick
Andrea Sprio

Abstract

Voice-based digital biomarkers offer a non-invasive and inherently scalable approach to monitoring physiological and psychological state. A substantial body of foundational work in speech science, respiratory physiology, and psychophysiology demonstrates that vocal production systematically reflects underlying biological processes, including respiratory mechanics, autonomic regulation, cognitive demand, and affective state. Despite this strong theoretical basis, quantitative analytical validation of voice-derived biomarkers at population scale remains limited, particularly with respect to consistency, robustness, and distributional behaviour across large biomarker portfolios. This study presents a large-scale analytical validation of the Voice Biota programme using audited model performance outputs derived from standardised automated speech elicitation. Across 46 independent composite voice biomarkers and more than 1.5 million voice-derived data points, discrimination performance was assessed using the area under the receiver operating characteristic curve (AUC). Performance was consistently high (mean AUC = 0.899, SD = 0.029; range 0.851–0.946), with all biomarkers exceeding predefined analytical acceptance thresholds commonly adopted in digital biomarker evaluation. Beyond summary statistics, we performed a full set of distributional, cumulative, and normality analyses to characterise performance across the entire biomarker portfolio. These analyses confirmed stable, unimodal performance, with no evidence of pathological skewness, heavy-tailed behaviour, or weakly performing outliers. Taken together, these results provide evidence of the analytical validity and potential scalability of voice-derived biomarkers and establish a robust empirical basis for subsequent clinical, longitudinal, and regulatory validation studies.
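To make the evaluation described in the abstract concrete, the following is a minimal sketch of how a per-biomarker AUC and a portfolio-level summary could be computed. It is not the authors' pipeline: the function names, the toy inputs, and the `threshold` value are all assumptions made for illustration, and the abstract does not state the acceptance threshold that was actually used.

```python
import statistics


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive example scores higher
    than a random negative one, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarise(aucs):
    """Portfolio-level summary: mean, sample SD, range, and moment skewness."""
    mean = statistics.fmean(aucs)
    sd = statistics.stdev(aucs)
    n = len(aucs)
    m2 = sum((a - mean) ** 2 for a in aucs) / n  # second central moment
    m3 = sum((a - mean) ** 3 for a in aucs) / n  # third central moment
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return {"mean": mean, "sd": sd, "min": min(aucs), "max": max(aucs),
            "skewness": skewness}


def all_pass(aucs, threshold):
    """True if every biomarker meets the (hypothetical) acceptance threshold."""
    return all(a >= threshold for a in aucs)


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    print(auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    toy_aucs = [0.85, 0.90, 0.95]
    print(summarise(toy_aucs))
    print(all_pass(toy_aucs, threshold=0.80))  # threshold is an assumption
```

The pairwise comparison is quadratic in the number of examples, which is fine for illustration; at the scale reported in the abstract a rank-based implementation would be the more practical choice.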

Version published to 10.21203/rs.3.rs-8810881/v1 on Research Square
Feb 18, 2026

Related articles

Dynamic HRV Assessment Based on Gamma Auditory Stimulation

This article has 19 authors:
1. Zuojun Cao
2. Nianze Chen
3. Yiqi Tang
4. Ling Zhu
5. Hongtao Ji
6. Yue Shen
7. Xinwei Tang
8. Weiqiang Cai
9. Xi Zhang
10. Qun Zhang
11. Yuanyuan Chu
12. Longwen He
13. Ning Yang
14. YanQin Xi
15. Guheng Pan
16. Junfa Wu
17. Wu Yi
18. Junwei Lv
19. Hongyu Xie
This article has no evaluations. Latest version Mar 20, 2026

Impact of Soundscapes on Mental Well-Being and Physiological Stress: A Systematic Review and Meta-Analysis

This article has 2 authors:
1. Harley Glassman
2. Frank Russo
This article has no evaluations. Latest version Feb 12, 2026

Severity-Dependent Speech Characteristics and Clear Speech Response in Parkinson’s Disease: Perceptual, Acoustic, and Lingual Kinematic Findings

This article has 3 authors:
1. Austin Thompson
2. Lifeng Lin
3. Yunjung Kim
This article has no evaluations. Latest version Mar 25, 2026
