Population-based sequencing of Mycobacterium tuberculosis reveals how current population dynamics are shaped by past epidemics




Transmission has been proposed as a driver of tuberculosis (TB) epidemics in high-burden regions, with negligible impact in low-burden areas. Genomic epidemiology can help quantify transmission in different settings, but the lack of population-based whole-genome sequencing studies has hampered its use to compare transmission dynamics and the contribution of transmission across settings.


We generated an additional population-based sequencing dataset from the Valencia Region, a low-burden setting, and compared it with available datasets from different TB settings to reveal heterogeneity in transmission dynamics and its public health implications. We sequenced the whole genomes of 785 M. tuberculosis strains and linked the genomes to patient epidemiological data. We applied a pairwise distance clustering approach and phylodynamic methods to characterize transmission events over the last 150 years in three settings: Valencia, Spain (low burden); Oxfordshire, United Kingdom (low burden); and Karonga, Malawi (high burden).
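The pairwise distance clustering step can be illustrated with a minimal sketch. It assumes single-linkage clustering on SNP distances between aligned sequences, which the abstract does not specify; the function names, the toy sequences, and the threshold are hypothetical and are not taken from the study.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count aligned positions that differ, skipping gaps and ambiguous bases."""
    ignored = {"-", "N"}
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x not in ignored and y not in ignored
    )


def cluster_by_threshold(seqs, threshold):
    """Single-linkage clustering (an assumption): isolates are linked when
    their pairwise SNP distance is <= threshold. Returns clusters of size >= 2."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [g for g in groups.values() if len(g) > 1]


def clustering_rate(seqs, threshold):
    """Fraction of isolates that fall in any cluster."""
    clusters = cluster_by_threshold(seqs, threshold)
    return sum(len(c) for c in clusters) / len(seqs)


# Toy alignment (hypothetical): A-B and B-C differ by 1 SNP, D is distant.
toy = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "ACGTACGAAT",
    "D": "TGCATGCATG",
}
print(cluster_by_threshold(toy, threshold=1))
print(clustering_rate(toy, threshold=1))
```

With single linkage, A and C end up in the same cluster through B even though they differ by 2 SNPs, which is one reason the reported clustering percentage depends on the threshold and linkage rule chosen.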


Our results revealed high local transmission in the Valencia Region (47.4% clustering), in contrast to Oxfordshire (27% clustering), and similar to a high-burden setting like Malawi (49.8% clustering). By modelling times of the transmission events, we observed that settings with high transmission are associated with uninterrupted transmission of strains over decades, irrespective of burden.


Our results underscore significant differences in transmission between TB settings even with similar burdens, reveal the role of past epidemics in the ongoing TB epidemic, and highlight the need for in-depth characterization of transmission dynamics and specifically tailored TB control strategies.


European Research Council under the European Union’s Horizon 2020 research and innovation program (Grants 638553-TB-ACCELERATE, 101001038-TB-RECONNECT), and Ministerio de Ciencia e Innovación (Spanish Government, SAF2016-77346-R and PID2019-104477RB-I00)

Article activity feed

  1. Evaluation Summary:

    This work presents in-depth epidemiologic and phylogenetic analyses of tuberculosis cases across Valencia, Spain and comparator low-burden (Oxfordshire, UK) and high-burden (Karonga, Malawi) regions. Findings reveal that the "low burden" observed in Valencia is not in fact reflective of low transmission in this setting, with detected lineages likely to have circulated locally over the course of decades and to have been transmitted in the community.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

  2. Reviewer #2 (Public Review):

    The authors present a 3-year population study of M. tuberculosis transmission in Valencia, Spain.
    They set out to assess the epidemiology and transmission dynamics of M. tuberculosis in their region, and evaluate how valid SNP thresholds, defined in other settings, might be in Valencia, or indeed elsewhere.

    The authors achieve dense sampling (77% of all culture-positive cases). They show how local transmission accounts for much of the local case load, and nicely demonstrate with a ROC curve that a threshold of 11.5 SNPs achieves optimal sensitivity and specificity when used for contact tracing. The authors also use time-scaled phylogenetic reconstruction of historical transmission events to show that local strains have been circulating for over 150 years.

    The authors draw 4 conclusions, namely that transmission can still be an important factor even in a low-burden setting; that much transmission occurs in the community rather than in the household; that generous SNP thresholds capture transmission links that are often no longer relevant to contact tracing efforts; and that where strains are endemic, a continuum of relatedness (SNP distance) can be observed. They conclude, correctly, that the public health response needs to be informed by knowledge of the local problem, and that epidemiological patterns can be quite different even across similar socio-economic settings with similar incidence of disease. These conclusions are important and should encourage others to investigate their localities in such detail rather than drawing inferences from the findings of ostensibly similar settings.

    It is, however, also important to be clear on the distinction between the use of SNP thresholds for directing contact investigations and the use of SNP thresholds for understanding the transmission dynamics in a population over time (e.g. the behaviour of an endemic strain). There may be a role for SNP thresholds when directing contact tracing, although this remains controversial, whereas the authors correctly show that thresholds are far less useful for understanding the longer-term behaviour of local strains.
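The ROC-based choice of a SNP threshold mentioned in this review can be sketched as follows. This is a minimal illustration that assumes the optimum is defined by Youden's J (sensitivity + specificity - 1), which the review does not state; the pair data are invented and the function names are hypothetical.

```python
def roc_points(pairs):
    """For each candidate cutoff, return (cutoff, sensitivity, specificity).

    pairs: list of (snp_distance, epidemiologically_linked) tuples.
    A pair is predicted 'linked' when its distance is <= cutoff.
    Candidate cutoffs are midpoints between consecutive observed distances.
    """
    distances = sorted({d for d, _ in pairs})
    cutoffs = [(a + b) / 2 for a, b in zip(distances, distances[1:])]
    n_pos = sum(1 for _, linked in pairs if linked)
    n_neg = len(pairs) - n_pos
    points = []
    for t in cutoffs:
        tp = sum(1 for d, linked in pairs if linked and d <= t)
        tn = sum(1 for d, linked in pairs if not linked and d > t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def best_threshold(pairs):
    """Pick the cutoff maximizing Youden's J (an assumed optimality criterion)."""
    return max(roc_points(pairs), key=lambda p: p[1] + p[2] - 1)


# Invented example: (SNP distance, known epidemiological link?)
toy_pairs = [
    (0, True), (2, True), (5, True), (6, True), (30, True),
    (7, False), (20, False), (35, False), (60, False), (100, False),
]
cutoff, sensitivity, specificity = best_threshold(toy_pairs)
print(cutoff, sensitivity, specificity)
```

Because candidate cutoffs are taken midway between observed integer distances, the selected value can be a half-integer, which may be why a threshold such as 11.5 SNPs arises. A cutoff tuned this way targets contact tracing and says little about the long-term behaviour of endemic strains, in line with the distinction drawn above.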

  3. Reviewer #1 (Public Review):

    This paper aims to look at 4 main things:
    1. The correlation between a country's/region's TB burden and the level of local transmission
    2. The relevance of proposed SNP cut-offs for defining transmission clusters in different settings
    3. The link between genetic clusters of Mycobacterium tuberculosis isolates and transmission clusters proposed by contact tracing methods
    4. The effect of historical local transmission on current day epidemiological dynamics

    Overall, the paper achieves many of these goals to different extents, with strong support for the first three aims. However, some difficulties with modelling past transmission dynamics leave the last aim, and indeed the one contained within the title, much less well supported.

    1. The correlation between a country's/region's TB burden and the level of local transmission
    One of the primary findings of the paper is that low burden does not mean low local transmission. The opposite is often assumed, based primarily on work in the UK, which the authors nicely show has very different dynamics from Valencia. There is strong evidence presented here that the same public health actions cannot be used to eliminate TB in all low-burden settings, although this could be better outlined in the discussion.

    2. The relevance of proposed SNP cut-offs for defining transmission clusters in different settings
    The 12 SNP cut-off is used almost universally to define recent transmission of M. tuberculosis, even though it was originally demonstrated only in a low-burden, low-transmission setting and then subsequently linked to timespans in a small MDR-TB dataset in a high-burden country. The demonstration that the 12 SNP threshold means very different things in different settings is well presented and is a necessary point to make. The results outlined here add well to previous suggestions on this point and are likely the strongest evidence published to date for setting-specific cut-offs, or for abolishing cut-offs completely.

    3. The link between genetic clusters of Mycobacterium tuberculosis isolates and transmission clusters proposed by contact tracing methods
    This finding is well supported by the sensitivity/specificity and accuracy measures. The correlations between SNP cut-off and epidemiological link are in line with several previous publications, so are to be expected. That transmission is likely occurring more in the community than in the household has been reported by some in high-burden settings but is not well known in low-burden settings. This work clearly shows it, but it could be better highlighted in the discussion.

    4. The effect of historical local transmission on current day epidemiological dynamics
    This section is perhaps where the paper is either most confusing or least supported. While it is very likely that the current population structure is heavily influenced by past transmission dynamics, the approach to this question is not easily understood. The Bayesian analysis is largely well done, with adequate prior settings and MCMC parameters, although the apparent use of a SNP alignment without ascertainment bias correction has a strong chance of producing inaccuracies in the time tree. The historical estimation of transmission events over 150 years implicitly relies on many assumptions, such as transmission burden being the same in each country and local transmission excluding export and subsequent re-import, making it difficult to understand and extrapolate the findings with any certainty.
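The ascertainment bias concern can be made concrete with a small arithmetic sketch. It compares a per-site distance computed over variable sites only with the same distance computed over the whole genome. The genome length is that of the H37Rv reference (about 4.4 Mb); the number of variable sites and the number of differences are hypothetical, and this is not a description of how the authors or any phylogenetic software handle the correction.

```python
GENOME_LENGTH = 4_411_532  # M. tuberculosis H37Rv reference length in bp


def per_site_distance(n_differences, n_sites):
    """Proportion of sites that differ between two sequences."""
    return n_differences / n_sites


# Hypothetical values for illustration only.
n_variable_sites = 10_000  # columns kept in a SNP-only alignment
n_differences = 12         # SNPs separating two isolates

# Distance when the invariant sites are silently dropped.
snp_only = per_site_distance(n_differences, n_variable_sites)

# Distance when invariant sites are counted in the denominator.
whole_genome = per_site_distance(n_differences, GENOME_LENGTH)

# Factor by which the SNP-only alignment overstates per-site divergence.
inflation = snp_only / whole_genome
print(snp_only, whole_genome, inflation)
```

In this toy case the SNP-only alignment overstates per-site divergence by a factor equal to the genome length divided by the number of variable sites, which is why a model that is not told about the invariant sites can misestimate rates and node ages.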