Voice as a Digital Biomarker: Foundation Model-Based COPD Assessment

Abstract

Chronic obstructive pulmonary disease (COPD) is frequently underdiagnosed because access to spirometry is limited, highlighting the need for simple and scalable screening tools. Voice is an easily obtained and widely accessible respiratory biomarker, yet its potential for COPD assessment remains largely unexplored. We therefore developed a voice-only COPD assessment model that applies self-supervised speech representations from a wav2vec 2.0 foundation model to patient-recorded voice. We collected 1,709 recordings from 277 participants (227 with COPD, 50 controls) across maximum phonation and standardized reading tasks, each captured both before and after a 30-second chair-stand test. Without handcrafted acoustic features or clinical variables, the model predicted both COPD presence and severity. The post-exercise reading condition achieved the highest performance, with an AUC of 0.81 for COPD detection and 0.71 for severity classification. Age-stratified analysis showed strong discrimination in adults younger than 65 years (AUC 0.87), while severity prediction remained consistent across age groups. These findings indicate that exertion-induced vocal alterations, combined with foundation-model representations, encode physiologic signatures of airflow limitation, enabling a practical and scalable approach to COPD screening and remote monitoring.
