Large-Scale Analytical Validation of Voice-Derived Digital Biomarkers Using Automated Speech Elicitation
Abstract
Voice-based digital biomarkers offer a non-invasive and inherently scalable approach to monitoring physiological and psychological state. A substantial body of foundational work in speech science, respiratory physiology, and psychophysiology demonstrates that vocal production systematically reflects underlying biological processes, including respiratory mechanics, autonomic regulation, cognitive demand, and affective state. Despite this strong theoretical basis, quantitative analytical validation of voice-derived biomarkers at population scale remains limited, particularly with respect to consistency, robustness, and distributional behaviour across large biomarker portfolios. This study presents a large-scale analytical validation of the Voice Biota programme using audited model performance outputs derived from standardised automated speech elicitation. Across 46 independent composite voice biomarkers and more than 1.5 million voice-derived data points, discrimination performance was assessed using the area under the receiver operating characteristic curve (AUC). Performance was consistently high (mean AUC = 0.899, SD = 0.029; range 0.851–0.946), with all biomarkers exceeding predefined analytical acceptance thresholds commonly adopted in digital biomarker evaluation. In addition to these summary statistics, we performed a full set of distributional, cumulative, and normality analyses to characterise performance across the entire biomarker portfolio. These analyses confirmed stable, unimodal performance, with no evidence of pathological skewness, heavy tails, or weak outliers. Taken together, these results provide evidence of the analytical validity and potential scalability of voice-derived biomarkers and establish a robust empirical basis for subsequent clinical, longitudinal, and regulatory validation studies.
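To make the evaluation described above concrete, the following is a minimal sketch of how a per-biomarker AUC and the portfolio-level summary statistics could be computed. It is not the authors' pipeline: the function names (`auc`, `skewness`, `summarise`), the acceptance threshold of 0.80, and the example values are all hypothetical assumptions introduced for illustration, since the abstract does not state the threshold or the underlying data. AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
from statistics import mean, stdev


def auc(scores, labels):
    """AUC as P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def skewness(values):
    """Population (Fisher-Pearson) skewness: third central moment over SD cubed."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def summarise(aucs, threshold=0.80):
    """Portfolio-level summary; the default threshold is an assumed value, not from the study."""
    return {
        "mean": mean(aucs),
        "sd": stdev(aucs),  # sample standard deviation
        "min": min(aucs),
        "max": max(aucs),
        "skewness": skewness(aucs),
        "all_pass": all(a >= threshold for a in aucs),
    }


if __name__ == "__main__":
    # Made-up per-biomarker AUC values, for illustration only.
    example_aucs = [0.86, 0.88, 0.90, 0.91, 0.93]
    print(summarise(example_aucs))
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a sketch; at the scale reported in the abstract, a rank-based implementation from an established statistics library would be the more practical choice. A formal normality test would similarly be taken from such a library rather than written by hand.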