Characterizing the informativeness of pathogen genome sequence datasets about transmission between population groups

Abstract

Pathogen genome analysis helps characterize transmission between population groups. The information carried by pathogen sequences comes from the accumulation of mutations within their genomes; thus, the pace at which mutations accumulate should determine the granularity of transmission processes that pathogen sequences can characterize. Here, we investigate how the complex interplay between mutation, transmission, population mixing and sampling impacts study power. First, we develop a conceptual probabilistic framework to quantify the ability of pairs of sequences to capture between-group transmission history. This allows us to comprehensively explore the space of possible phylogeographic analyses by explicitly considering the pace at which mutations accumulate and the pace at which between-group transmission events occur. Using this framework, we identify a pathogen-intrinsic limit on the mixing scale at which sequence data remain informative, with faster-mutating pathogens enabling finer spatial characterization. Second, we perform a simulation study exploring a range of assumptions regarding sequencing intensity. We find that sample size imposes a further limit on the characterization of between-group transmission processes. This work highlights inherent horizons of observability for population mixing processes that depend on the interaction between evolution, transmission, mixing and sampling. Such considerations are important for the design of phylogeographic studies.
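
The intuition that the pace of mutation bounds the granularity sequences can resolve can be illustrated with a minimal sketch. It is not the framework developed in the article: it assumes mutations accumulate as a Poisson process at a constant rate, and the function name, the rates and the generation time are all hypothetical values chosen for illustration only.

```python
import math


def prob_identical(mu_per_day: float, days: float) -> float:
    """Probability that no mutation occurs over `days` of evolution.

    Assumption (not from the article): mutations accumulate as a Poisson
    process with constant rate `mu_per_day` (mutations per genome per day),
    so the number of mutations is Poisson(mu_per_day * days).
    """
    return math.exp(-mu_per_day * days)


# Hypothetical values for illustration only.
generation_time_days = 5.0
fast_rate = 0.1    # mutations per genome per day, "fast" pathogen
slow_rate = 0.01   # mutations per genome per day, "slow" pathogen

for label, rate in [("fast", fast_rate), ("slow", slow_rate)]:
    # Two sequences separated by k transmission generations are identical
    # only if no mutation occurred along the k generations separating them.
    for k in (1, 5, 10):
        p = prob_identical(rate, k * generation_time_days)
        print(f"{label} pathogen, {k} generation(s) apart: P(identical) = {p:.3f}")
```

Under these assumptions, sequences from the slowly mutating pathogen stay identical across many more transmission generations, so they carry less information about which transmission events separate them.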
