The mutational signatures of poor treatment outcomes on the drug-susceptible Mycobacterium tuberculosis genome

Curation statements for this article:
  • Curated by eLife


    eLife assessment

    In this useful study, a GWAS-type analysis is applied to clinical Mycobacterium tuberculosis isolates to discover genetic polymorphisms linked to poor tuberculosis outcomes. The evidence for the detected associations is still incomplete, as the corresponding polymorphisms are not adequate to power a prediction model for infection outcome, although key host factors - including patient age, sex, and duration of diagnostic delay (which have stronger predictive value) - appear to enhance predictive capacity.


Abstract

Drug resistance is a known risk factor for poor tuberculosis (TB) treatment outcomes, but the contribution of other bacterial factors to poor outcomes in drug-susceptible TB is less well understood. Here, we generate a population-based dataset of drug-susceptible Mycobacterium tuberculosis (MTB) isolates from China to identify factors associated with poor treatment outcomes. We analyzed whole-genome sequencing (WGS) data of MTB strains from 3196 patients, including 3105 patients with good and 91 patients with poor treatment outcomes, and linked genomes to patient epidemiological data. A genome-wide association study (GWAS) was performed to identify bacterial genomic variants associated with poor outcomes. Risk factors identified by logistic regression analysis were used in clinical models to predict treatment outcomes. GWAS identified fourteen MTB fixed mutations associated with poor treatment outcomes, but only 24.2% (22/91) of strains from patients with poor outcomes carried at least one of these mutations. Isolates from patients with poor outcomes showed a higher ratio of reactive oxygen species (ROS)-associated mutations compared to isolates from patients with good outcomes (26.3% vs 22.9%, t-test, p=0.027). Patient age, sex, and duration of diagnostic delay were also independently associated with poor outcomes. Bacterial factors alone had poor power to predict poor outcomes with an AUC of 0.58. The AUC with host factors alone was 0.70, but increased significantly to 0.74 (DeLong’s test, p=0.01) when bacterial factors were also included. In conclusion, although we identified MTB genomic mutations that are significantly associated with poor treatment outcomes in drug-susceptible TB cases, their effects appear to be limited.
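
The AUC values quoted above can be read as the probability that a randomly chosen patient with a poor outcome receives a higher predicted risk than a randomly chosen patient with a good outcome. The sketch below illustrates that definition only; the function name `auc` and the toy risk scores are assumptions made for illustration and are not taken from the study's models or data.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy, made-up predicted risks (NOT data from the study).
poor = [0.9, 0.6, 0.4]        # patients with poor outcomes
good = [0.5, 0.3, 0.2, 0.1]   # patients with good outcomes
toy_auc = auc(poor, good)     # 11 of the 12 pairs are ordered correctly
print(f"toy AUC = {toy_auc:.3f}")
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why an AUC of 0.58 for bacterial factors alone indicates weak discrimination.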

Article activity feed

  1. Author Response

    Reviewer #2 (Public Review):

    The availability of large collections of Mycobacterium tuberculosis (Mtb) isolates has enabled many important studies looking to identify mycobacterial genetic polymorphisms associated with anti-tuberculosis (TB) drug resistance, including both classical "resistance-conferring" mutations and novel "resistance-enabling" mutations. Importantly, these studies have expanded our understanding of mycobacterial genetic adaptations undermining chemotherapy, in many cases allowing for improved diagnostic tests and predictions of treatment failure. In this submission, Gao and colleagues adopt a different approach to the problem: although also applying a GWAS-type analysis, they instead attempt to elucidate polymorphisms implicated in poor outcomes of TB patients undergoing treatment for the drug-susceptible disease. Starting with a large dataset comprising 3496 samples with corresponding clinical (host) metadata, the authors generate Mtb whole-genome sequence data for 91 samples obtained from patients with "poor" outcomes and 3105 patients with "good" outcomes. These are used to identify 14 fixed and >230 unfixed mutations that might be associated with "poor" treatment outcomes, a conclusion which they argue is plausible given transcriptional evidence implicating many of the identified genes in the mycobacterial response in vitro to first-line drug exposure and/or hypoxia, both of which are considered relevant to clinical disease. Notably, they also identify a tendency for a greater proportion of "ROS mutational signatures" in unfixed mutations from "poor" outcome samples. Finally, incorporating these observations in a prediction model, the authors observe that the mycobacterial factors aren't adequate on their own but, when combined with key host factors - including patient age, sex, and duration of diagnostic delay (which have stronger predictive value) - they enhance predictive capacity. 
    In summary, this paper reports a novel approach yielding observations that offer tantalizing insight into the mycobacterial factors which might influence TB treatment outcomes independent of drug resistance. However, the following must be considered:

    (i) The manuscript provides little to no detail about how the samples were obtained, other than the fact that they comprise "pre-treatment" samples: are they all sputum samples? Were they induced? Similarly, no information is provided about sample propagation: were the samples cultured to achieve sufficient biomass for whole-genome sequencing? If so, in what growth media, for how long, and how many passages? Were all samples treated identically? And were they plated to single colonies - or are the "isolates" referred to throughout the manuscript actually heterogenous populations of potentially different Mtb clones obtained - and propagated - as a mixed sample? This information is critical given the potential that the identified polymorphisms - both fixed and (perhaps even more so) unfixed - might have arisen as a consequence of in vitro (laboratory) manipulation under standard aerobic conditions.

    Thanks for your encouraging comments. The requested information about sample propagation has been added to the methods section in the new version. For details, please see our response, above, to the essential revisions (Q1).

    (ii) A key question that arises from this study (and others like it) is whether causation has been adequately established. Ideally, the Mtb genotypes contained within samples obtained pre-treatment should be compared with samples obtained from the same patients following treatment - that is, when the "poor" outcome was manifest. The expectation is that the polymorphisms identified prior to initiation of therapy - especially the 14 fixed mutations - should be evident (even dominant) at the later stage when therapy failed (or at the subsequent presentation in cases of relapse). Recognizing that this is not easily accomplished, though, it seems fair to suggest that the perceived relevance of the identified mutations would be strengthened if the authors were able to provide any other evidence - perhaps from studies of drug-resistant Mtb isolates - supporting their inferred role in undermining frontline treatment.

    Thank you for these insightful questions. We sequenced the isolates obtained at the time of relapse for all 47 relapse cases and found that the 14 GWAS-identified fixed mutations were only detected in relapse isolates from the 13 patients whose first samples also contained the GWAS-identified mutations. None of the 14 mutations we identified were found in isolates from the other relapsed patients. We also searched for the presence or absence of these 14 mutations in published studies seeking noncanonical mutations associated with drug-resistant Mtb isolates [5-7]. None of the 14 mutations we identified were reported in any of these studies, but two of the genes (ctpB & metA) in which our mutations were found had been previously identified as potentially associated with first-line drug resistance.

    (iii) Related to the above, the authors make the valid point that their intention here was different from other studies which have deliberately utilized drug-resistant Mtb isolates to identify resistance-conferring and resistance-enabling mutations (such as in the study they cite by Hicks et al). It would be interesting to know, however, if any of the mutations identified in those other studies were also picked up in this work - and, if not, why that might be the case.

    As mentioned in our response to the previous question, none of our mutations were mentioned in prior studies. Our inference is that the 14 fixed mutations we identified had only limited effects on outcomes, which would explain why: they were not identified in previous studies; isolates from only 24.2% (22/91) of patients carried any of these 14 mutations; and none of the mutations were shared amongst all 22 patients.
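
A short calculation shows why carriage in only 22 of 91 poor-outcome patients limits predictive power. For a yes/no predictor, the AUC equals the mean of sensitivity and specificity. The sketch below assumes a hypothetical predictor that flags a patient whenever the isolate carries any of the 14 mutations. This is an illustration and not the authors' model, and the specificity of 1.0 is a best-case assumption rather than a reported figure.

```python
def binary_auc(sensitivity: float, specificity: float) -> float:
    """AUC of a single yes/no predictor: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


carriers_poor = 22   # poor-outcome patients carrying >= 1 of the 14 mutations
total_poor = 91      # all poor-outcome patients

# Sensitivity of the hypothetical "carries any mutation" rule.
sensitivity = carriers_poor / total_poor

# Best case: assume no good-outcome patient carries any of the mutations.
best_case_auc = binary_auc(sensitivity, 1.0)

print(f"sensitivity   = {sensitivity:.1%}")
print(f"best-case AUC = {best_case_auc:.2f}")
```

Under these assumptions the ceiling sits only a little above the AUC of 0.58 reported for bacterial factors alone, which fits the authors' inference that the mutations have limited effects.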

    (iv) Finally, the analyses presented in this study are heavily dependent on the use of appropriate statistical methods to identify potentially rare genetic polymorphisms. However, as noted for sample processing (see my earlier comment above), there is very little detail provided about the methodology applied. This omission detracts from the interpretation, especially given that the predominance of lineage 2 (which contributes >75% of the isolates, with sublineage 2.3 constituting >50%) risks a lineage-specific association, rather than a more generalizable pathogenicity phenotype. Similarly, the heavy skew in the numbers of "good" (3105 samples) versus "poor" (91 samples) collections (approximately 34x difference in sample size) raises the possibility that mutations identified in the "poor" category might be artificially over-represented. More clarity in detailing the statistical methods is required to allay any concerns about the identification of candidate polymorphisms.

    Thank you for pointing this out. We have added details of our statistical methods to the methods section, and in the results section we have indicated the specific statistical methods used and the meaning of the statistical metrics.
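
To make the reviewer's concern about group imbalance concrete, the sketch below tests a single variant with a one-sided Fisher's exact test, which stays valid when one group is much smaller than the other. It is not the authors' GWAS procedure, which is not described in this section. The function name `fisher_one_sided` and the carrier counts are invented for illustration, and the test does not correct for lineage structure, which would need a stratified or mixed-model approach.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """P(X >= a) under the hypergeometric null for the 2x2 table [[a, b], [c, d]].

    a = poor outcome with variant,  b = poor outcome without variant
    c = good outcome with variant,  d = good outcome without variant
    """
    row1 = a + b            # poor-outcome patients
    col1 = a + c            # variant carriers
    n = a + b + c + d       # all patients
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(n, row1)


n_poor, n_good = 91, 3105        # group sizes quoted in the review
imbalance = n_good / n_poor      # roughly 34-fold

# Hypothetical carrier counts for one variant (NOT from the study).
carriers_poor, carriers_good = 5, 40
p_value = fisher_one_sided(
    carriers_poor, n_poor - carriers_poor,
    carriers_good, n_good - carriers_good,
)
print(f"imbalance = {imbalance:.1f}x, one-sided p = {p_value:.3g}")
```

Because thousands of variants are tested in a GWAS, any per-variant p-value of this kind would still need a multiple-testing correction before being interpreted.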

  2. eLife assessment

    In this useful study, a GWAS-type analysis is applied to clinical Mycobacterium tuberculosis isolates to discover genetic polymorphisms linked to poor tuberculosis outcomes. The evidence for the detected associations is still incomplete, as the corresponding polymorphisms are not adequate to power a prediction model for infection outcome, although key host factors - including patient age, sex, and duration of diagnostic delay (which have stronger predictive value) - appear to enhance predictive capacity.

  3. Reviewer #1 (Public Review):

    The authors aimed to study the contribution of bacterial factors to poor treatment outcomes in drug-susceptible TB, an important issue that has not been well studied. The authors performed GWAS on a very large population-based dataset (3 sites in China) of WGS data from 3416 pre-treatment Mtb isolates linked with clinical data to predict treatment outcomes. Logistic regression was used to assess the association between predictors and outcomes, and ROC curves were generated to assess the value of the genomic signatures in predicting poor TB treatment outcomes. The authors were successful in identifying 14 Mtb variants in 13 genes, as well as reactive oxygen species (ROS)-associated mutations, that were more likely to occur in isolates from patients with poor treatment outcomes.

    The investigators were very thorough, investigating both fixed and unfixed mutations and analyzing the changes in gene expression under stress (exposure to first-line drugs and hypoxic conditions) for the 13 genes identified, which further strengthened the evidence generated by GWAS. The authors attempted to perform an external validation of their findings but could not identify a suitable existing dataset.

    These data can be used by others to guide their analyses, and confirm if these 13 genes are also found in other settings. If confirmed, then the results could open the possibility for individualised tailoring of treatment of drug-susceptible TB, especially to prevent the risk of relapse.
