Internet search patterns reveal clinical course of COVID-19 disease progression and pandemic spread across 32 countries

Abstract

Effective public health response to novel pandemics relies on accurate and timely surveillance of pandemic spread, as well as characterization of the clinical course of the disease in affected individuals. We sought to determine whether Internet search patterns can be useful for tracking COVID-19 spread, and whether these data could also be useful in understanding the clinical progression of the disease in 32 countries across six continents. Temporal correlation analyses were conducted to characterize the relationships between a range of COVID-19 symptom-specific search terms and reported COVID-19 cases and deaths for each country from January 1 through April 20, 2020. Increases in COVID-19 symptom-related searches preceded increases in reported COVID-19 cases and deaths by an average of 18.53 days (95% CI 15.98–21.08) and 22.16 days (20.33–23.99), respectively. Cross-country ensemble averaging was used to derive average temporal profiles for each search term, which were combined to create a search-data-based view of the clinical course of disease progression. Internet search patterns revealed a clear temporal pattern of disease progression for COVID-19: Initial symptoms of fever, dry cough, sore throat and chills were followed by shortness of breath an average of 5.22 days (3.30–7.14) after initial symptom onset, matching the clinical course reported in the medical literature. This study shows that Internet search data can be useful for characterizing the detailed clinical course of a disease. These data are available in real-time at population scale, providing important benefits as a complementary resource for tracking pandemics, especially before widespread laboratory testing is available.
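
The two analysis steps named in the abstract, temporal correlation and cross-country ensemble averaging, can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not say which correlation measure or alignment procedure was used, so the Pearson correlation, the helper names `best_lag` and `ensemble_average`, the `max_lag` window, and the synthetic curves below are all assumptions made for illustration.

```python
import numpy as np


def best_lag(search, cases, max_lag=30):
    """Return (lag, r): the lead time in days at which the search series
    correlates best with the case series, using Pearson correlation.

    Assumption: both series are daily, aligned on the same start date,
    and of equal length. A positive lag means searches precede cases.
    """
    search = np.asarray(search, dtype=float)
    cases = np.asarray(cases, dtype=float)
    best, best_r = 0, -np.inf
    for lag in range(max_lag + 1):
        # Pair search[t] with cases[t + lag].
        x = search[: len(search) - lag]
        y = cases[lag:]
        r = np.corrcoef(x, y)[0, 1]
        if r > best_r:
            best, best_r = lag, r
    return best, best_r


def ensemble_average(curves):
    """Average equally long, already aligned per-country curves pointwise.

    Assumption: alignment across countries has been done beforehand.
    """
    return np.mean(np.vstack(curves), axis=0)


# Synthetic example (not real data): a search curve peaking on day 40
# and a case curve with the same shape peaking 18 days later.
days = np.arange(110, dtype=float)
search = np.exp(-((days - 40) / 8) ** 2)
cases = np.exp(-((days - 58) / 8) ** 2)

lag, r = best_lag(search, cases)
print(f"searches lead cases by {lag} days (r = {r:.3f})")
```

Because the synthetic case curve is an exact 18-day shift of the search curve, the sketch recovers that lag by construction. Real search and case series are noisy, so the estimated lag would carry uncertainty of the kind expressed by the confidence intervals in the abstract.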

Article activity feed

  1. SciScore for 10.1101/2020.05.01.20087858:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: For languages in which we were unable to recruit native speakers for translations, we used Google Translate [26] and confirmed that sufficient data were available for these translated search terms on Google Trends [22]. A complete table of search terms and translations for each country is provided in Supplementary Table S-1.
    Resource: Google
    Suggested: (Google, RRID:SCR_017097)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    The use of Internet search data is subject to a number of important limitations [46, 10, 47]. Internet infrastructure and digital access levels differ across countries and communities. Even though digital access rates are generally increasing worldwide, many developing countries currently lack sufficient search volumes to support search-based tracking. Search data may be subject to demographic, socio-economic, geographic, or other biases inherent in the local digital divide [15, 48, 49]. In each country, the population of individuals who perform Internet searches may have different characteristics than those who do not, and the results inferred from Internet searching behaviors may not generalize to other populations. Changes in search volumes for symptom-related terms such as “fever” could result not only from increases in COVID-19 cases, but also from general curiosity about the pandemic, the occurrence of other diseases (e.g. influenza, Lassa fever [50]), news coverage, or other factors. In this study, we used specific symptom-related search terms and examined data from a geographically diverse set of 32 countries across six continents and multiple languages. In recognition of the inherent variability of the data at the individual country level, we performed a global analysis combining data from all 32 countries. Despite the inherent variability of country-specific search data, our results show that the temporal relationships between the symptom-specific Internet search terms and COVID-...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.