Near real-time surveillance of the SARS-CoV-2 epidemic with incomplete data

Abstract

When responding to infectious disease outbreaks, rapid and accurate estimation of the epidemic trajectory is critical. However, two common data collection problems affect the reliability of the epidemiological data in real time: missing information on the time of first symptoms, and retrospective revision of historical information, including right censoring. Here, we propose an approach to construct epidemic curves in near real time that addresses these two challenges by 1) imputation of dates of symptom onset for reported cases using a dynamically estimated “backward” reporting delay conditional distribution, and 2) adjustment for right censoring using the NobBS software package to nowcast cases by date of symptom onset. This process allows us to obtain an approximation of the time-varying reproduction number (R_t) in real time. We apply this approach to characterize the early SARS-CoV-2 outbreak in two Spanish regions between March and April 2020. We evaluate how these real-time estimates compare with more complete epidemiological data that became available later. We explore the impact of the different assumptions on the estimates, and compare our estimates with those obtained from commonly used surveillance approaches. Our framework can help improve accuracy, quantify uncertainty, and evaluate frequently unstated assumptions when recovering the epidemic curves from limited data obtained from public health systems in other locations.
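To make step 1 of the abstract more concrete, the following is a minimal Python sketch of imputing missing symptom-onset dates from an empirical "backward" delay distribution, that is, the distribution of report-to-onset delays among cases reported around the same date. It is an illustration under stated assumptions and not the authors' implementation: the function name `impute_onset_dates`, the `window_days` parameter, and the fallback to all observed delays are hypothetical choices, and the sketch does not cover step 2 (nowcasting with NobBS, which is an R package).

```python
import random
from datetime import date, timedelta


def impute_onset_dates(cases, window_days=3, seed=0):
    """Fill in missing onset dates by sampling a reporting delay.

    cases: list of (report_date, onset_date_or_None) tuples.
    window_days: hypothetical half-width (in days) of the report-date
        window used to build the conditional delay distribution.
    Returns a list of (report_date, onset_date) tuples in the same order.
    """
    rng = random.Random(seed)

    # Delays (report date minus onset date, in days) for complete cases.
    observed = [(r, (r - o).days) for r, o in cases if o is not None]
    if not observed:
        raise ValueError("at least one case with a known onset date is required")

    completed = []
    for report, onset in cases:
        if onset is None:
            # Backward delay distribution: delays of cases whose report
            # date falls within the window around this report date.
            pool = [d for r, d in observed
                    if abs((r - report).days) <= window_days]
            if not pool:
                # Assumption: fall back to all observed delays.
                pool = [d for _, d in observed]
            onset = report - timedelta(days=rng.choice(pool))
        completed.append((report, onset))
    return completed


if __name__ == "__main__":
    d0 = date(2020, 3, 20)
    example = [
        (d0, d0 - timedelta(days=2)),
        (d0, d0 - timedelta(days=4)),
        (d0, None),  # onset date missing, to be imputed
    ]
    for report, onset in impute_onset_dates(example):
        print(report, onset)
```

Conditioning on the report date is what makes the delay distribution "backward" and lets it change over the course of the outbreak. In practice, one would repeat the imputation several times to propagate its uncertainty into the epidemic curve.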

Article activity feed

  1. SciScore for 10.1101/2021.01.25.20230094:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our approach has several limitations. First, its validity relies on the assumption that the date of symptom onset is missing at random and that the available historical data were sufficient to parameterize the unknown reporting delay. However, our sensitivity analyses indicate that the overall trajectory of the epidemic curve was relatively robust to small departures from these assumptions. Second, our approach underperforms when little information is available for training the nowcasting algorithm, including good estimates of the reporting delay distribution. Our estimates were sensitive to the choice of imputation and nowcasting procedures when the date of symptom onset was unknown for a high percentage of confirmed cases. Third, our approach requires that underdetection/underreporting of cases does not change significantly over time; otherwise, such changes would adversely affect the estimates [23]. Finally, our estimates could be improved by reconstructing the epidemic curve by the date of infection rather than that of symptom onset, though this would require more complex methods given that the temporal delay from infection to symptom onset is much harder to characterize [17,24–27]. Development of ready-to-use tools for epidemic dynamics modelling helps surveillance services to appropriately present data for efficient epidemic control, but understanding the limitations of the procedure and the impact of prespecified assumptions is critical for interpretation. We believe that the...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.