Network topology outweighs emergence probability in surveillance sentinel placement


Abstract

Background

Many studies have focused on understanding spatial heterogeneity in infectious disease emergence probability; however, how this information can be leveraged to optimize sentinel node selection for early outbreak detection on complex networks remains largely unexplored.

Methods

We simulated outbreaks on diverse synthetic and empirical networks, and quantified early detection performance as the average reduction in outbreak size achieved when an outbreak is detected by a given sentinel set. We first used genetic algorithms to identify optimal sentinel sets and to understand characteristics potentially relevant to early detection performance. We then trained a Random Forest-based Surrogate Model (RFSM) on the identified characteristics to assess the relative importance of different network features and to enable rapid prediction of node selection rank. RFSM was benchmarked against five alternative surveillance strategies on networks not used in training to evaluate generalizability. Sensitivity analyses were conducted to examine how feature importance varied with network structure and epidemiological parameters.
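As an illustration of the performance measure described above, the following minimal sketch simulates outbreaks on a network and scores a sentinel set. It is not the authors' simulation code: the discrete-time spread model (each node infectious for one step), the transmission probability `beta`, the relative form of the reduction (fraction of the final outbreak size averted), and all function names are assumptions made for this example.

```python
import random

def simulate_outbreak(adj, origin, beta, rng):
    """Assumed spread model: each infected node gets one step to infect
    each susceptible neighbor with probability beta.
    Returns a dict mapping node -> time step of infection."""
    infected_at = {origin: 0}
    frontier = [origin]
    t = 0
    while frontier:
        t += 1
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in infected_at and rng.random() < beta:
                    infected_at[v] = t
                    next_frontier.append(v)
        frontier = next_frontier
    return infected_at

def detection_performance(adj, emergence_prob, sentinels,
                          beta=0.5, runs=500, seed=0):
    """Average fraction of the final outbreak size averted if the outbreak
    is stopped when it first reaches any sentinel (assumed metric form)."""
    rng = random.Random(seed)
    nodes = list(adj)
    weights = [emergence_prob[n] for n in nodes]
    sentinels = set(sentinels)
    total = 0.0
    for _ in range(runs):
        # Outbreak origin is drawn according to emergence probability.
        origin = rng.choices(nodes, weights=weights)[0]
        infected_at = simulate_outbreak(adj, origin, beta, rng)
        final_size = len(infected_at)
        hit_times = [t for n, t in infected_at.items() if n in sentinels]
        if hit_times:
            t_detect = min(hit_times)
            size_at_detect = sum(1 for t in infected_at.values()
                                 if t <= t_detect)
        else:
            # Undetected outbreaks run to completion: no reduction.
            size_at_detect = final_size
        total += (final_size - size_at_detect) / final_size
    return total / runs

# Hypothetical toy network: a path of five nodes, uniform emergence.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
uniform = {n: 1.0 for n in path}
full_surveillance = detection_performance(path, uniform, list(path), beta=1.0, runs=50)
no_surveillance = detection_performance(path, uniform, [], beta=1.0, runs=50)
```

With every node monitored, each outbreak in this toy case is caught at its origin, so the score is an upper bound against which smaller sentinel sets can be compared.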

Results

Surveillance strategies incorporating emergence probability outperformed those based solely on network topology, but the improvement was modest across all examined scenarios. Dynamic selection features capturing overlapping information among sentinel sites, such as the proportion of a candidate node’s neighbors that have already been selected, were the most important determinants of early detection, followed by global and node topology-related features. Emergence probability-related features were less influential but gained importance with greater node degree heterogeneity, larger variability in the emergence probability distribution, and greater negative correlation between node degree and emergence probability. Selecting only six sentinels achieved approximately 90% of the performance of full-network surveillance. RFSM achieved performance comparable to that of the genetic algorithm (GA), while requiring only 1/24,000 of GA’s computational time on a network with 200 nodes.
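The dynamic overlap feature mentioned above can be made concrete with a short sketch. The function `selected_neighbor_fraction` computes the proportion of a candidate node's neighbors already chosen as sentinels. The greedy ranking that follows is only a hypothetical heuristic (degree discounted by overlap) meant to show how such a feature changes as sentinels are added; it is not RFSM and not the GA from the study, and all names are assumptions.

```python
def selected_neighbor_fraction(adj, node, selected):
    """Proportion of `node`'s neighbors that are already in `selected`."""
    neighbors = adj[node]
    if not neighbors:
        return 0.0
    return sum(1 for v in neighbors if v in selected) / len(neighbors)

def greedy_sentinels(adj, k):
    """Hypothetical heuristic: repeatedly pick the unselected node with the
    highest degree, discounted by its overlap with current sentinels."""
    order = []
    chosen = set()
    for _ in range(min(k, len(adj))):
        best = max(
            (n for n in adj if n not in chosen),
            key=lambda n: len(adj[n])
            * (1.0 - selected_neighbor_fraction(adj, n, chosen)),
        )
        order.append(best)
        chosen.add(best)
    return order

# Hypothetical toy network: hub 0 with a short tail through node 3.
toy = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3]}
overlap_of_3 = selected_neighbor_fraction(toy, 3, {0})
ranking = greedy_sentinels(toy, 2)
```

Because the feature depends on which nodes are already selected, it has to be recomputed after every pick, which is why the text describes it as dynamic rather than as a fixed topological property.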

Conclusions

Information about spatial heterogeneity in emergence probability provided limited additional benefit beyond network topology in selecting sentinel nodes for early outbreak detection on complex networks. The online tool RFSM offers a ready-to-use, computationally efficient and robust framework to support the design of effective disease surveillance networks.
