Characterizing spatiotemporal patterns of case reporting backfill: a case study of COVID-19 reporting in Michigan, 2020–24
Abstract
Backfill is the process of revising case data, often by retrospectively assigning or reassigning newly reported cases to earlier symptom onset dates. Temporally and spatially varying delays in the backfill process may compromise real-time surveillance and forecasting efforts by obscuring the true underlying transmission patterns. Using Michigan COVID-19 case data, we developed a statistical mixture model to describe backfill and its geographical and temporal variation. The model combined an exponential process (case reporting delay) and a gamma-distributed process (date reassignment). Parameters were estimated by maximum likelihood with lasso regularization, and the Akaike Information Criterion was used to determine whether the reassignment component was needed for each date. We estimated the exponential reporting speed over time and space and, where appropriate, the transient peak and timing of case reassignment. We found that case reporting improved over the pandemic: reporting speed increased over time (with substantial day-to-day variation), and case reassignments were processed faster. We also identified potential regional disparities: rural regions with population densities below 50 people/km² had slower backfill speeds. These findings provide critical insights into the evolution of case reporting and backfill dynamics that can be leveraged by “nowcasting” models to complete real-time surveillance data, ultimately improving outbreak preparedness and response.
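As a rough illustration of the model-selection step described in the abstract, the sketch below compares an exponential-only delay model with an exponential-plus-gamma mixture using AIC. The abstract does not give the functional form of the model, so everything here is an assumption: the mixture is written as a weighted sum of two delay densities, the L1 penalty is placed on the mixture weight, the delay values are made up, and the mixture parameters are hand-picked rather than fitted. The function names are hypothetical and are not taken from the paper.

```python
import math


def exp_logpdf(t, rate):
    """Log-density of an exponential delay with the given rate (per day)."""
    return math.log(rate) - rate * t


def gamma_logpdf(t, shape, scale):
    """Log-density of a gamma delay (shape/scale parameterization), t > 0."""
    return ((shape - 1.0) * math.log(t) - t / scale
            - math.lgamma(shape) - shape * math.log(scale))


def mixture_loglik(delays, rate, weight, shape, scale):
    """Log-likelihood of (1 - weight) * Exp(rate) + weight * Gamma(shape, scale).

    Assumed form only; requires 0 <= weight < 1 and all delays > 0.
    """
    total = 0.0
    for t in delays:
        dens = (1.0 - weight) * math.exp(exp_logpdf(t, rate))
        if weight > 0.0:
            dens += weight * math.exp(gamma_logpdf(t, shape, scale))
        total += math.log(dens)
    return total


def penalized_nll(delays, rate, weight, shape, scale, lam):
    """Negative log-likelihood plus an L1 (lasso-style) penalty on the weight.

    Where the penalty is applied is an assumption made for this sketch.
    """
    return -mixture_loglik(delays, rate, weight, shape, scale) + lam * abs(weight)


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik


# Made-up reporting delays (days) for one onset date: an early cluster
# (ordinary reporting) and a late cluster (a possible reassignment event).
delays = [0.5, 1, 1.5, 2, 2, 3, 4, 14, 15, 15, 16, 17]

# Exponential-only model: the MLE of the rate is 1 / mean delay.
rate_hat = len(delays) / sum(delays)
ll_exp = mixture_loglik(delays, rate_hat, 0.0, 1.0, 1.0)
aic_exp = aic(ll_exp, n_params=1)

# Mixture model with hand-picked (not fitted) parameters; a real analysis
# would minimize penalized_nll numerically over all four parameters.
ll_mix = mixture_loglik(delays, rate=0.5, weight=0.4, shape=60.0, scale=0.25)
aic_mix = aic(ll_mix, n_params=4)

# Keep the gamma (reassignment) component only if it lowers the AIC.
keep_reassignment = aic_mix < aic_exp
print(f"AIC exponential-only: {aic_exp:.2f}")
print(f"AIC mixture:          {aic_mix:.2f}")
print(f"Keep reassignment component: {keep_reassignment}")
```

On these made-up delays the late cluster is poorly explained by a single exponential, so the mixture attains the lower AIC even with unfitted parameters. For a date with no late cluster, the three extra parameters would typically not pay for themselves and the exponential-only model would be retained.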