Predicting Onset of COVID-19 with Mobility-Augmented SEIR Model


Abstract

Timely interventions and early preparedness of healthcare resources are crucial measures to tackle the COVID-19 disease. To aid these efforts, we developed the Mobility-Augmented SEIR model (MA-SEIR) that leverages Google’s aggregate and anonymized mobility data to augment classic compartmental models. We show in a retrospective analysis how this method can be applied at an early stage in the COVID-19 epidemic to forecast its subsequent spread and onset in different geographic regions, with minimal parameterization of the model. This provides insight into the role of near real-time aggregate mobility data in disease spread modeling by quantifying substantial changes in how populations move both locally and globally. These changes would be otherwise very hard to capture using less timely data.

Article activity feed

  1. SciScore for 10.1101/2020.07.27.20159996

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: The ODE solver “odeint” in SciPy [31] was used to solve this system.
    Resource: SciPy (suggested: SciPy, RRID:SCR_008058)
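For readers unfamiliar with the solver named above, the following is a minimal sketch of how a classic SEIR system can be integrated with `scipy.integrate.odeint`. It is not the authors' MA-SEIR model: the mobility terms are omitted, and the function name `seir_rhs`, the parameter values, and the initial conditions are all illustrative assumptions.

```python
import numpy as np
from scipy.integrate import odeint


def seir_rhs(y, t, beta, sigma, gamma):
    """Right-hand side of the classic SEIR equations.

    State variables are fractions of a closed population:
    susceptible (s), exposed (e), infectious (i), recovered (r).
    """
    s, e, i, r = y
    new_exposed = beta * s * i      # new infections per unit time
    ds = -new_exposed
    de = new_exposed - sigma * e    # exposed become infectious at rate sigma
    di = sigma * e - gamma * i      # infectious recover at rate gamma
    dr = gamma * i
    return [ds, de, di, dr]


# Illustrative values only (assumptions, not taken from the paper).
beta, sigma, gamma = 0.5, 1 / 5.0, 1 / 7.0
y0 = [0.999, 0.0, 0.001, 0.0]       # initial fractions: S, E, I, R
t = np.linspace(0, 160, 161)        # daily time points

# odeint returns one row per time point and one column per state variable.
sol = odeint(seir_rhs, y0, t, args=(beta, sigma, gamma))
```

Because the four derivatives sum to zero, the total population fraction stays at 1 throughout the run, which is a quick sanity check on any SEIR implementation.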

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

    These results should be interpreted in light of several important limitations. First, the Google mobility data are limited to smartphone users who have opted in to Google’s Location History feature, which is off by default. These data may not be representative of the population as a whole, and their representativeness may vary by location. Importantly, these limited data are only viewed through the lens of differential privacy algorithms, specifically designed to protect user anonymity and obscure fine detail. Moreover, comparisons across rather than within locations are only descriptive, since these regions can differ in substantial ways.

    In our work and in relation to the dataset described above, the “location” can be a country, state, or county. The “flow” represents the number of users moving from source to destination locations at the same granularity. The temporal bucket is weekly. Our mobility data do not have sufficient coverage in China. According to published data, in 2018 Hong Kong had one of the busiest airports in terms of international passenger traffic (74 million) [36], whereas the international passenger traffic volume of Mainland China in 2018 is estimated to be 64 million [37]. We therefore use the mobility data in Hong Kong to approximate the mobility flow originating from China. Regardless of the granularity, all mobility data are protected by the privacy policies described above.

    COVID-19 dataset

    Country-level COVID-19 case numbers are obtained from NSSAC (Network Systems Science and Advanced Computing) daily case reports from the University of Virginia [38]. US county-level COVID-19 confirmed case numbers are downloaded from the New York Times [39].

    Experiments

    Three experiments are designed to evaluate the performance of the proposed MA-SEIR approach and to understand the effects of mobility on the spread of COVID-19. We assume that the mobility data beyond the date when the simulation is performed are unknown and can only be estimated.

    1. Global country-level modeling of COVID-19 spread is assumed to be performed on January 30, 2020 to forecast the epidemic onset, with seeds from confirmed cases reported before this date. The simulation start date is December 1, 2019. The mobility flows used in the simulation therefore include real-time mobility data for December 2019 and January 2020, and approximated mobility for the rest of 2020 (using data from February to May 2019).
    2. State-level modeling of COVID-19 spread for the United States is assumed to be performed on February 14, 2020, with seeds from confirmed cases reported before this date. The start date of the mobility data is the same as the first date of the simulations. To simulate a real forecasting scenario, mobility data beyond the date of the last seeding infection (February 14, 2020) are replaced by the average mobility data prior to that date.
    3. Different levels of simulated mobility reduction are then incorporated into the MA-SEIR model to evaluate their effects on epidemic onsets. The onset dates derived from these simulations with various degrees of mobility reduction are compared.

    An epidemic onset is technically defined as the time when the number of cases is above what is normally expected (the “epidemic threshold”) [40]. Different approaches to calculating the epidemic threshold have been proposed [41]. For SARS-CoV-2, there is no consensus epidemic threshold yet. In this work, a subjective fixed epidemic threshold is defined in order to quantify the model performance. We use 100 and 30 cumulative confirmed cases as the epidemic thresholds for countries and states, respectively. To evaluate the model's forecasting accuracy, we compute the mean and median of the absolute errors between the actual and predicted onset days. We also summarize the positive (early prediction) and negative (late prediction) errors individually.
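The onset definition and error summary described above can be sketched as follows. This is a hedged illustration, not the authors' code: the function names `onset_day` and `summarize_onset_errors` are hypothetical, the comparison `>=` against the threshold is an assumption, and the sign convention (error = actual minus predicted, so a positive error means an early prediction) is inferred from the wording of the quoted text. The sample numbers are made up.

```python
import numpy as np


def onset_day(cumulative_cases, threshold):
    """Return the index of the first day on which cumulative cases reach
    the threshold, or None if the threshold is never reached."""
    cases = np.asarray(cumulative_cases)
    hits = np.flatnonzero(cases >= threshold)
    return int(hits[0]) if hits.size else None


def summarize_onset_errors(actual_days, predicted_days):
    """Summarize onset-day errors (assumed convention: actual - predicted).

    Positive errors are early predictions, negative errors are late ones.
    """
    errors = (np.asarray(actual_days, dtype=float)
              - np.asarray(predicted_days, dtype=float))
    early = errors[errors > 0]
    late = errors[errors < 0]
    return {
        "mean_abs_error": float(np.mean(np.abs(errors))),
        "median_abs_error": float(np.median(np.abs(errors))),
        "mean_early_error": float(early.mean()) if early.size else None,
        "mean_late_error": float(late.mean()) if late.size else None,
    }


# Made-up cumulative case counts for one country; threshold of 100 cases.
country_onset = onset_day([0, 5, 20, 60, 110, 300], threshold=100)  # index 4

# Made-up actual and predicted onset days for three locations.
summary = summarize_onset_errors([40, 55, 62], [35, 60, 62])
```

Reporting early and late errors separately keeps them from cancelling each other out, which a single signed mean would allow.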


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.

  2. SciScore for 10.1101/2020.07.27.20159996

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: The ODE solver “odeint” in SciPy [31] was used to solve this system.
    Resource: SciPy (suggested: SciPy, RRID:SCR_008058)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

  3. SciScore for 10.1101/2020.07.27.20159996

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Overall, the MA-SEIR model predicted the onset of COVID-19 epidemic with good accuracy (MAE: 11.9 days).
    Resource: MAE (suggested: Nevada-Reno University Center for Bioinformatics Core Facility, SCR_017802)
    Sentence: The ODE solver “odeint” in SciPy [31] was used to solve this system.
    Resource: SciPy (suggested: SciPy, SCR_008058)

    Data from additional tools are added to each annotation on a weekly basis.
