Deep learning-based detection of COVID-19 using wearables data
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Background
COVID-19 is an infectious disease caused by SARS-CoV-2 that is primarily diagnosed using laboratory tests, which are frequently not administered until after symptom onset. However, SARS-CoV-2 is contagious multiple days before symptom onset and diagnosis, thus enhancing its transmission through the population.
Methods
In this retrospective study, we collected heart rate and step data at intervals of 15 seconds to one minute from Fitbit devices during the COVID-19 period (February 2020 until June 2020). Resting heart rate was computed by selecting the heart rate intervals where steps were zero for 12 minutes ahead of an interrogated time point. Data for each participant were divided into training (baseline) data, taken from the non-infectious days before the infectious period, and test data, taken from the days during the COVID-19 infectious period. Data augmentation was used to increase the number of training days. We developed a deep learning approach based on a Long Short-Term Memory (LSTM) network-based autoencoder, called LAAD, to predict COVID-19 infection by detecting abnormal resting heart rate in the test data relative to the user’s baseline.
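The two preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the per-minute sampling, and the reading of "12 minutes ahead of" as "the 12 minutes preceding" a time point are all assumptions. The 7-day and 21-day bounds of the infectious period are taken from the Findings; using every earlier day as baseline is likewise an assumption.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple


def resting_heart_rate(heart_rate: List[float], steps: List[int],
                       window: int = 12) -> List[Optional[float]]:
    """Keep a heart rate sample only if no steps were recorded in the
    `window` samples before it (assumed: one sample per minute)."""
    resting: List[Optional[float]] = []
    for i, hr in enumerate(heart_rate):
        # Require a full window of history with zero steps.
        if i >= window and sum(steps[i - window:i]) == 0:
            resting.append(hr)
        else:
            resting.append(None)
    return resting


def split_days(days: List[date], symptom_onset: date,
               pre_days: int = 7, post_days: int = 21
               ) -> Tuple[List[date], List[date]]:
    """Split recorded days into baseline (train) and infectious (test) sets."""
    start = symptom_onset - timedelta(days=pre_days)
    end = symptom_onset + timedelta(days=post_days)
    train = [d for d in days if d < start]          # assumed baseline
    test = [d for d in days if start <= d <= end]   # infectious period
    return train, test


# Hypothetical data, for illustration only.
hr = [70, 72, 90, 88, 65, 64, 63]
st = [0, 0, 40, 0, 0, 0, 0]
rhr = resting_heart_rate(hr, st, window=3)

recorded = [date(2020, 3, 1) + timedelta(days=i) for i in range(40)]
train_days, test_days = split_days(recorded, symptom_onset=date(2020, 3, 20))
```

In this sketch, samples that do not qualify as resting are kept as `None` so that the output stays aligned with the input timestamps.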
Findings
We detected an abnormal resting heart rate (RHR) during the period of viral infection (7 days before symptom onset to 21 days after) in 92% (23 of 25) of patients with laboratory-confirmed COVID-19. LAAD identified 56% (14) of cases in their pre-symptomatic phase, whereas 36% (9) were detected after the onset of symptoms, with an average precision score of 0·91 (SD 0·13, 95% CI 0·854–0·967), a recall score of 0·36 (0·295, 0·232–0·487), and an F-beta score of 0·79 (0·226, 0·693–0·888). In COVID-19 positive patients, abnormal RHR patterns start 5 days before symptom onset (6·9 days before onset in pre-symptomatic cases and 1·9 days after onset in post-symptomatic cases). COVID-19 positive patients have longer abnormal RHR periods (89 hours or 3·7 days) than healthy individuals (25 hours or 1·1 days).
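For readers unfamiliar with the reported metrics, the sketch below shows the standard definitions of precision, recall, and F-beta computed from counts of true positives, false positives, and false negatives. The abstract does not state the value of beta or how individual detections were counted, so the function name, the `beta` parameter, and the example counts are hypothetical.

```python
from typing import Tuple


def precision_recall_fbeta(tp: int, fp: int, fn: int,
                           beta: float = 1.0) -> Tuple[float, float, float]:
    """Standard precision, recall, and F-beta from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    # F-beta weights recall beta times as much as precision.
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Hypothetical counts, not taken from the study.
p, r, f = precision_recall_fbeta(tp=8, fp=2, fn=8, beta=0.5)
```

A beta below 1 favors precision over recall. Note also that the abstract reports scores averaged across users, so an F-beta computed from the averaged precision and recall need not match the averaged F-beta.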
Interpretation
These findings show that deep learning neural networks applied to wearables data are an effective method for the early detection of COVID-19 infection. Additional validation data will help guide the use of this and similar techniques in real-world infection surveillance and isolation policies to reduce transmission and end the pandemic.
Funding
This work was supported by NIH grants and gifts from the Flu Lab, as well as departmental funding from the Stanford Genetics department. The Google Cloud Platform costs were covered by Google for Education academic research and COVID-19 grant awards.
Research in context
Evidence before the study
COVID-19 resulted in up to 1·7 million deaths worldwide in 2020. COVID-19 detection using laboratory tests is usually performed after symptom onset. This delay can allow the spread of viral infection and can cause outbreaks. We searched PubMed, Google, and Google Scholar for research articles published in English up to Dec 1, 2020, using common search terms including “COVID-19 and wearables”, “Resting heart rate and viral infection”, “Resting heart rate and COVID-19”, “machine learning and COVID-19” and “deep-learning and COVID-19”. Previous studies have attempted to use an elevated resting heart rate as an indicator of viral infection. Although these studies have investigated the early prediction of COVID-19 using resting heart rate and other wearables data, a deep learning-based prediction model with performance evaluation metrics at the user level has not been reported.
Added value of this study
In this study, we created a deep-learning system that used wearables data such as abnormal resting heart rate to predict COVID-19 before symptom onset. The deep-learning system was created using retrospective time-series datasets collected from 25 COVID-19+ patients, 11 non-COVID-19 patients, and 70 healthy individuals. To our knowledge, this is the first deep-learning model to identify an early viral infection using wearables data at the user level. This study also greatly extends our previous phase-1 study and factors in the unpredictable behavior and time-series nature of the data, the limited data size, and the lack of data labels when evaluating performance metrics. A real-time version of this model, using more data along with user feedback, may help to scale early detection as the number of patients with COVID-19 continues to grow.
Implications of all the available evidence
In the future, wearable devices may provide high-resolution sleep, temperature, oxygen saturation, respiration rate, and electrocardiogram data, which could be used to further characterize an individual’s baseline and improve the deep-learning model performance for infectious disease detection. Using multi-sensor data with a real-time deep-learning model has the potential to alert individuals to illness prior to symptom onset and may greatly reduce viral spread.
Article activity feed
-
SciScore for 10.1101/2021.01.08.21249474: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences: Visualization: All the plots were generated using the following libraries - Matplotlib https://matplotlib.org/; Seaborn https://seaborn.pydata.org/; ggplot https://ggplot2.tidyverse.org/.
Resources: Matplotlib, suggested: (MatPlotLib, RRID:SCR_008624)
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: However, our study has several limitations. First, all symptom onset dates were self-reported by the patients, usually after diagnostic confirmation of COVID-19. Since the data was divided into training and test sets and all metrics calculations were based on the self-reported symptom onset date, errors in self-reporting can introduce bias in our results and model performance. Second, we divided the symptomatic period into pre-symptomatic and post-symptomatic periods using recent studies on the viral infectiousness period. However, the length of these periods could vary substantially from one patient to another and thus introduce bias in our results. Third, none of the healthy patients had COVID-19 tests, and it is therefore possible that some of them had asymptomatic infections. Indeed, we found 11 healthy cases where abnormal RHR was detected for 3·7 days (89 hours) more during the infectious period, similar to COVID-19 patients (appendix 2 pp 12). Fourth, on average only approximately 3 months of data was collected per user and thus deriving training, validation and test data from such limited data may be a limiting factor for the model performance. Fifth, we did not test any confirmed COVID19 negative cases, which limits the potential of our study. Sixth, only 25 COVID-19 cases were used in the analysis. Adding more samples will improve our understanding of wearable data performance in detecting COVID-19. Seventh, all data used in this study were collected from Fitbit sma...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.