Development of a vocal biomarker for fatigue monitoring in people with COVID-19


Abstract

Objective

To develop a vocal biomarker for fatigue monitoring in people with COVID-19.

Design

Prospective cohort study.

Setting

Data from the Predi-COVID cohort study, collected between May 2020 and May 2021.

Participants

A total of 1772 voice recordings were used to train an AI-based algorithm to predict fatigue, stratified by gender and smartphone operating system (Android/iOS). The recordings were collected from 296 participants tracked for two weeks following SARS-CoV-2 infection.

Primary and secondary outcome measures

Four machine learning algorithms (logistic regression, k-nearest neighbors, support vector machine, and soft voting classifier) were trained to derive the fatigue vocal biomarker. A two-sample t-test was used to compare the distribution of the vocal biomarker between the two classes (Fatigue and No fatigue).
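The class comparison described here is a standard two-sample t-test on the predicted biomarker values. A minimal sketch with placeholder data (the Predi-COVID biomarker values themselves are not public):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
biomarker = rng.normal(0.5, 0.15, size=300)   # placeholder biomarker values per recording
fatigue = rng.integers(0, 2, size=300)        # 1 = Fatigue, 0 = No fatigue

# Compare biomarker distributions between the two self-reported classes.
t_stat, p_value = ttest_ind(biomarker[fatigue == 1], biomarker[fatigue == 0])
print(f"t = {t_stat:.2f}, P = {p_value:.3g}")
```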

Results

The final study population was 56% women and had a mean (±SD) age of 40 (±13) years. Women were more likely to report fatigue (P < .001). We developed four models, for Android female, Android male, iOS female, and iOS male users, with weighted AUCs of 79%, 85%, 86%, and 82%, and mean Brier scores of 0.15, 0.12, 0.17, and 0.12, respectively. The vocal biomarker derived from the prediction models successfully discriminated between COVID-19 participants with and without fatigue (t-test, P < .001).
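Both reported metrics are standard scikit-learn calls. How the per-stratum AUCs were weighted is not specified in this feed, so this sketch simply computes the plain binary versions on placeholder data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=100)                          # placeholder fatigue labels
y_prob = np.clip(0.3 * y_true + 0.7 * rng.random(100), 0, 1)   # placeholder predicted probabilities

print("AUC:", roc_auc_score(y_true, y_prob))        # discrimination
print("Brier:", brier_score_loss(y_true, y_prob))   # calibration (mean squared error of probabilities)
```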

Conclusions

This study demonstrates the feasibility of identifying and remotely monitoring fatigue from voice. Vocal biomarkers, digitally integrated into telemedicine technologies, are expected to improve the monitoring of people with COVID-19 or Long COVID.

Article activity feed

  1. SciScore for 10.1101/2022.03.01.22271496:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms

    Sentence: "The first, Type 1 audio, required participants to read paragraph 1 of article 25 of the Declaration of Human Rights [21], in their preferred language: French, German, English, or Portuguese; and the second, Type 2 audio, required them to hold the [a] vowel phonation without breathing for as long as they could (see Supplementary Online Material 1 for more details)."
    Resource: Portuguese
    Suggested: None

    Sentence: "The Mel spectrograms were passed through VGG19 convolutional neural network architecture provided by Keras, which was pre-trained on the ImageNet database [27]." (a feature-extraction sketch follows after this table)
    Resource: ImageNet
    Suggested: None

    Sentence: "A 10-fold cross-validation procedure was conducted on the training cohort participants to evaluate four classification models (logistic regression (LR), k-nearest neighbors (KNN), support vector machine (SVM), and soft voting classifier (VC), scikit-learn implementation in Python) at different regularization levels via a grid search, with the following evaluation metrics: area under the ROC curve (AUC), accuracy, F1-score, precision, and recall." (a cross-validation sketch follows after this table)
    Resource: Python
    Suggested: (IPython, RRID:SCR_001658)
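    A hedged sketch of the feature-extraction step quoted above: a Mel spectrogram passed through an ImageNet-pretrained VGG19 via Keras. The spectrogram parameters, the 224x224 resizing, and the average-pooling output layer are assumptions; the paper's exact settings are in its Supplementary Online Material.

```python
import numpy as np
import librosa
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input

# Load VGG19 once; include_top=False with average pooling yields a
# 512-dimensional embedding per input image.
model = VGG19(weights="imagenet", include_top=False, pooling="avg")

def vgg19_embedding(wav_path, sr=16000):
    """Turn one voice recording into a fixed-length VGG19 feature vector."""
    signal, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=signal, sr=sr, n_mels=224)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    # Crudely fit the spectrogram to VGG19's 224x224 input (an assumption)
    # and replicate it across three channels to mimic an RGB ImageNet image.
    img = np.resize(mel_db, (224, 224))
    img = np.stack([img] * 3, axis=-1).astype("float32")
    return model.predict(preprocess_input(img[np.newaxis, ...]))[0]
```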
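    And a sketch of the model-selection step: 10-fold cross-validation with a grid search over the four scikit-learn classifiers, scored on the metrics listed above. The hyperparameter grids and data are illustrative; the paper's actual regularization levels are not reproduced in this feed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 512))       # placeholder VGG19 feature vectors
y = rng.integers(0, 2, size=200)      # placeholder fatigue labels (1 = Fatigue)

# Illustrative grids for the three base models.
candidates = {
    "lr": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1, 10]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 11]}),
    "svm": (SVC(probability=True), {"C": [0.1, 1, 10]}),
}
scoring = ["roc_auc", "accuracy", "f1", "precision", "recall"]

tuned = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=10, scoring=scoring, refit="roc_auc")
    tuned[name] = search.fit(X, y).best_estimator_

# Soft voting classifier combining the tuned base models.
vc = VotingClassifier(estimators=list(tuned.items()), voting="soft").fit(X, y)
```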

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Study Limitations: This study has several limitations. First, although our data were stratified by gender and smartphone device, the mix of languages might also result in different voice features and, subsequently, in different model performances. There is presently no comparable dataset with similar audio recordings for further external validation of our findings. Thus, more data should be collected to improve the transferability of our vocal biomarker to other populations. Second, our data labeling was based only on a qualitative self-reported fatigue status. A fatigue severity scale would allow a quantitative assessment of fatigue severity in a uniform and unbiased way across all participants. Finally, time series voice analysis for each participant was not included in the study. More investigation, including time series analysis, would establish a personalized baseline for each participant, potentially enhancing the performance of our vocal biomarkers.

    Results from TrialIdentifier: We found the following clinical trial numbers in your paper:

    Identifier: NCT04380987
    Status: Recruiting
    Title: Luxembourg Cohort of Positive Patients for COVID-19: a Strat…


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.