Classification of unsequenced Mycobacterium tuberculosis strains in a high-burden setting using a pairwise logistic regression approach

Abstract

Over the past three decades, molecular epidemiological studies have provided new opportunities to investigate the transmission dynamics of Mycobacterium tuberculosis. In most studies, a sizable fraction of individuals with notified tuberculosis cannot be included, either because they do not have culture-positive disease (and thus have no specimens available for molecular typing) or because resources for sequencing are limited. A recent study introduced a regression-based approach for inferring the membership of unsequenced tuberculosis cases in transmission clusters from host demographic and epidemiological data. That method identified the most likely cluster to which an unsequenced strain belonged with an accuracy of 35%, although in a low-burden setting where a large fraction of cases occurred among foreign-born migrants. Here, we apply a similar model to M. tuberculosis whole-genome sequencing data from the Republic of Moldova, a setting of relatively high local transmission. Using a maximum cluster span of ~40 single nucleotide polymorphisms (SNPs) and a minimum cluster size of n ≥ 10, the model identified the specific cluster to which each clustered case most likely belonged with an accuracy of 17.2%. In sensitivity analyses, neither a more restrictive (~20 SNPs) nor a more permissive (~80 SNPs) threshold improved performance, whereas increasing the minimum cluster size improved prediction accuracy. These findings highlight the challenges of transmission inference in high-burden settings like Moldova.
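To make the idea of pairwise cluster assignment concrete, the following is a minimal sketch and not the study's implementation. It assumes that a logistic model scores each pair of cases on host covariates, and that an unsequenced case is assigned to the cluster whose sequenced members have the highest mean pairwise probability. The covariates (`locality`, `age`, `year`), the coefficient values in `COEF`, the aggregation by mean, and the toy records are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical coefficients for illustration only; not estimates from the study.
COEF = {
    "intercept": -2.0,
    "same_locality": 1.5,
    "age_diff_decades": -0.3,
    "years_apart": -0.4,
}


def pair_probability(a, b):
    """Logistic probability that cases a and b belong to the same cluster."""
    z = (
        COEF["intercept"]
        + COEF["same_locality"] * (a["locality"] == b["locality"])
        + COEF["age_diff_decades"] * abs(a["age"] - b["age"]) / 10
        + COEF["years_apart"] * abs(a["year"] - b["year"])
    )
    return 1 / (1 + math.exp(-z))


def predict_cluster(new_case, sequenced_cases):
    """Return (best_cluster, scores), scoring each cluster by the mean
    pairwise probability between new_case and that cluster's members."""
    by_cluster = defaultdict(list)
    for case in sequenced_cases:
        by_cluster[case["cluster"]].append(pair_probability(new_case, case))
    scores = {c: sum(p) / len(p) for c, p in by_cluster.items()}
    return max(scores, key=scores.get), scores


# Toy records (invented): sequenced cases with known cluster labels.
sequenced = [
    {"cluster": "A", "locality": "locality_1", "age": 34, "year": 2018},
    {"cluster": "A", "locality": "locality_1", "age": 41, "year": 2019},
    {"cluster": "B", "locality": "locality_2", "age": 60, "year": 2018},
    {"cluster": "B", "locality": "locality_2", "age": 55, "year": 2019},
]
unsequenced = {"locality": "locality_1", "age": 38, "year": 2019}

best, scores = predict_cluster(unsequenced, sequenced)
print(best, scores)
```

In this toy setup the unsequenced case shares a locality and is close in age and time to the members of cluster A, so A receives the higher score. With many clusters that have similar host profiles, as might be expected under intense local transmission, such scores would be much harder to separate.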