Pitch-Synchronous Biomarkers for Voice-Based Diagnostics: An Introduction

Abstract

Background: Voice analysis combined with artificial intelligence (AI) is rapidly becoming a vital tool for disease diagnosis and monitoring. A key issue is the identification of vocal biomarkers, which are quantifiable features extracted from voice to assess a person’s health status or predict the likelihood of certain diseases. Currently, these biomarkers are extracted using pitch-asynchronous methods, which need improvement. Methods: Based on the timbron theory of voice production, a pitch-synchronous method of vocal biomarker extraction is proposed and demonstrated on standard voice databases, especially the ARCTIC speech databases published by Carnegie Mellon University. A complete set of formant parameters and timbre vectors for all US English monophthong vowels is presented, together with the timbre distances among those vowels, showing the richness and accuracy of the information contained in those biomarkers. Results: The methods are then applied to the voice recordings of the Saarbrücken voice database for voice diagnostics. Accurate and reproducible measurements of timbre vectors, jitter, shimmer, and spectral irregularity are generated, showing the usefulness of the method for diagnostics-oriented voice signals. Conclusions: Biomarkers extracted using pitch-synchronous methods might have significant advantages for voice-based diagnostics over traditional biomarkers. To quantify their effectiveness and accuracy, the pitch-synchronous methods should be tested on large-scale databases for diagnostics tasks and compared with the traditional biomarkers. Furthermore, the method of finding glottal closing instants from voice signals should be tested on voice databases for diagnostics tasks and on live voice signals.
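
As a rough illustration of two of the biomarkers named in the abstract, the sketch below computes jitter and shimmer from a sequence of glottal closing instants and per-period peak amplitudes. It uses the common "local" definitions (mean absolute difference between consecutive values, divided by the mean value), which may differ from the definitions used in the article. The function names and the sample numbers are hypothetical and chosen only for illustration.

```python
def periods_from_gci(gci_times):
    """Turn glottal closing instants (seconds) into pitch-period durations."""
    return [b - a for a, b in zip(gci_times, gci_times[1:])]


def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by the mean value.

    Applied to pitch periods this is local jitter; applied to per-period
    peak amplitudes it is local shimmer (assumed definitions, see lead-in).
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical closing instants (s) and per-period peak amplitudes.
gci = [0.0000, 0.0100, 0.0202, 0.0301, 0.0403]
amplitudes = [0.80, 0.82, 0.79, 0.81]

jitter = relative_perturbation(periods_from_gci(gci))
shimmer = relative_perturbation(amplitudes)
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

Because both measures are ratios over consecutive pitch periods, their accuracy depends directly on how well the glottal closing instants are located, which is the step the abstract singles out for further testing.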
