A Bayesian model for repeated cross-sectional epidemic prevalence survey data


Abstract

Epidemic prevalence surveys monitor the spread of an infectious disease by regularly testing representative samples of a population for infection. State-of-the-art Bayesian approaches for analysing epidemic survey data were constructed independently and under pressure during the COVID-19 pandemic. In this paper, we compare two existing approaches (one leveraging Bayesian P-splines and the other approximate Gaussian processes) with a novel approach (leveraging a random walk and fit using sequential Monte Carlo) for smoothing and performing inference on epidemic survey data. We use our simpler approach to investigate the impact of survey design and underlying epidemic dynamics on the quality of estimates. We then incorporate these considerations into the existing approaches and compare all three on simulated data and on real-world data from the SARS-CoV-2 REACT-1 prevalence study in England. All three approaches, once appropriate considerations are made, produce similar estimates of infection prevalence; however, estimates of the growth rate and instantaneous reproduction number are more sensitive to underlying assumptions. We also provide interactive notebooks applying all three approaches, alongside recommendations on hyperparameter selection and other practical guidance, which in some cases reduce runtime by orders of magnitude.
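
To make the random-walk idea concrete, the following is a minimal sketch of a bootstrap particle filter (one simple form of sequential Monte Carlo) applied to daily survey counts. It is not the model or code from the paper: the function name `bootstrap_filter`, the Gaussian random walk on logit-prevalence, the binomial observation model, the prior on the initial state, and the step size `sigma` are all assumptions made for illustration. It also only filters (estimates prevalence using data up to each day) and does not perform the smoothing or growth-rate inference discussed above.

```python
import numpy as np


def bootstrap_filter(positives, totals, n_particles=2000, sigma=0.1, seed=0):
    """Filtered posterior mean of prevalence at each survey time.

    Assumed (hypothetical) model, not taken from the paper:
      x_t = x_{t-1} + Normal(0, sigma^2)      # random walk on logit-prevalence
      positives_t ~ Binomial(totals_t, expit(x_t))
    """
    rng = np.random.default_rng(seed)
    positives = np.asarray(positives, dtype=float)
    totals = np.asarray(totals, dtype=float)

    # Diffuse initial particles on the logit scale (an assumed prior).
    x = rng.normal(loc=-4.0, scale=2.0, size=n_particles)
    means = np.empty(len(positives))

    for t in range(len(positives)):
        if t > 0:
            # Propagate each particle one random-walk step.
            x = x + rng.normal(0.0, sigma, size=n_particles)

        # Convert to prevalence, clipped to keep the logs finite.
        p = np.clip(1.0 / (1.0 + np.exp(-x)), 1e-12, 1.0 - 1e-12)

        # Binomial log-likelihood; the binomial coefficient is constant
        # across particles, so it cancels after normalisation.
        logw = positives[t] * np.log(p) + (totals[t] - positives[t]) * np.log1p(-p)
        w = np.exp(logw - logw.max())
        w /= w.sum()

        means[t] = np.sum(w * p)

        # Multinomial resampling in proportion to the weights.
        x = x[rng.choice(n_particles, size=n_particles, p=w)]

    return means


if __name__ == "__main__":
    # Simulated example: constant 2% prevalence, 5000 tests per day.
    sim = np.random.default_rng(1)
    n_tests = np.full(30, 5000)
    n_pos = sim.binomial(n_tests, 0.02)
    print(bootstrap_filter(n_pos, n_tests))
```

In this sketch, `sigma` controls how quickly the estimated prevalence is allowed to change between survey days, so it plays the role of a smoothing hyperparameter; a larger value follows the data more closely at the cost of noisier estimates.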

Author summary

Understanding how infections spread in a population is crucial during an epidemic, and large-scale surveys that test people for current infection can provide valuable insights. These surveys are resource-intensive, and the data they produce can be noisy and hard to interpret. In this study, we investigate how three different statistical approaches (two established and one novel) can be used to explore such data. We found that some common modelling choices, particularly the treatment of observation noise, can meaningfully shape results. Our findings highlight the need for careful, robust methods to help researchers and public health officials make the best use of existing data, design more effective surveys, and extract clearer insights from future studies.
