Distinguishing new from persistent infections at the strain level using longitudinal genotyping data
Abstract
Motivation
Longitudinal pathogen genotyping data from individual hosts can uncover strain-specific infection dynamics and their relationships to disease and intervention, especially in the malaria field. An important use case involves distinguishing newly incident from pre-existing (persistent) strains, but implementation faces statistical challenges relating to individual samples containing multiple strains, strains sharing alleles, and markers dropping out stochastically during the genotyping process. Current approaches to distinguish new versus persistent strains therefore rely primarily on simple rules that consider only the time since alleles were last observed.
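As an illustration of the kind of simple rule described above, the following Python sketch labels each observed allele as new or persistent using only the time since that allele was last seen in the same individual. The function name `classify_alleles`, the 30-day window, and the choice to treat first-ever observations as new are assumptions made for this example, not details taken from any particular study. The sketch deliberately ignores multi-strain samples, shared alleles, and marker dropout, which are the challenges listed above.

```python
def classify_alleles(observations, window_days=30):
    """Label each (day, allele) observation for ONE individual.

    An allele is called "new" if it was never seen before in this
    individual, or was last seen more than `window_days` ago
    (hypothetical threshold); otherwise it is called "persistent".
    """
    last_seen = {}  # allele -> most recent day it was observed
    labels = []
    for day, allele in sorted(observations):
        previous = last_seen.get(allele)
        if previous is None or day - previous > window_days:
            labels.append((day, allele, "new"))
        else:
            labels.append((day, allele, "persistent"))
        last_seen[allele] = day
    return labels


# Toy example: allele "A" is seen on days 0, 20, and 60.
visits = [(0, "A"), (0, "B"), (20, "A"), (60, "A"), (60, "C")]
for row in classify_alleles(visits):
    print(row)
```

Under this rule, the day-60 observation of allele "A" is called new only because more than 30 days have passed since day 20. The rule cannot tell whether the strain was truly cleared and reacquired or simply went undetected in between.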
Results
We developed DINEMITES (Distinguishing New Malaria Infections in Time Series), a set of statistical methods to estimate, from longitudinal genotyping data, the probability that each sequenced allele represents a new infection harboring that allele, the total molecular force of infection (molFOI, the cumulative number of newly acquired strains over time) for each individual, and the total number of new infection events for each individual. DINEMITES can handle time points with missing sequencing data, incorporate treatment history and covariates affecting the rate of new or persistent infections, and scale to studies with thousands of samples sequenced across multiple loci containing hundreds of possible alleles. In synthetic evaluations, the DINEMITES Bayesian model, which generally outperformed an alternative clustering-based model also developed in this work, accurately estimated key clinical parameters such as molFOI (bias 2.5, compared to −12.2 for a typical simple rule). When applied to three real longitudinal genotyping datasets, the model detected 33%, 112%, and 359% more average infections per participant than would have been detected by applying a typical simple rule to the equivalent datasets without sequencing.
Availability and implementation
DINEMITES is freely available as an R package, along with documentation, tutorials, and example data, at https://github.com/WillNickols/dinemites .