Accessing to additional diversity in Mycobacterium tuberculosis through long-read sequencing: Impact on redefinition of transmission clusters

Abstract

Whole-genome sequencing based on short reads has revolutionized the precision with which Mycobacterium tuberculosis (MTB) transmission can be tracked. However, the high GC content (65%) and repetitive regions (10%) of the MTB genome challenge short-read mapping and assembly, leading to the exclusion of certain genomic regions from the analysis. Long-read sequencing can overcome these limitations, giving access to regions that generally go uninterrogated. Our study aims to evaluate the potential of long-read sequencing to redefine long-term MTB transmission clusters previously characterized by short-read sequencing. We selected 78 cases from eight long-term clusters (5–17 years; 7 to 16 cases) from a population-based genomic epidemiology program in Almería, Spain. The clusters were carefully selected to include cases i) infected by identical strains, ii) exhibiting pairwise SNP-based distances from 1 to 16 SNPs, and iii) distributed along different branches in the genomic networks. Long-read analysis increased the distance of each cluster from the reference by an average of 258 SNPs and intercluster distances by 113 SNPs. Within-cluster diversity also increased, with pairwise distances rising from 1 to 22 SNPs across 1–7 network branches. In one cluster, the newly acquired diversity exceeded the 12-SNP threshold. Additionally, in four clusters, 1–2 cases previously classified as infected by identical strains were reclassified after additional SNP differences were identified. The new diversity identified between cases allowed us to reconstruct transmission links and propose new epidemiological interpretations for the cases in each cluster.
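
To illustrate how pairwise SNP distances and a 12-SNP threshold interact, the following is a minimal Python sketch, not the authors' pipeline. It assumes each case is represented as a mapping from genome position to the called allele at positions that differ from the reference, that two cases are linked when their distance is at most 12 SNPs, and that clusters are formed by single linkage. The function names and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import combinations

SNP_THRESHOLD = 12  # threshold named in the abstract; "<=" linkage is an assumption


def snp_distance(a, b):
    """Count positions where two cases carry different alleles.

    Each case is a dict {position: allele} listing only non-reference calls,
    so a position missing from one dict is treated as the reference allele.
    """
    return sum(1 for pos in a.keys() | b.keys() if a.get(pos) != b.get(pos))


def pairwise_distances(cases):
    """Return {(case_x, case_y): distance} for every unordered pair of cases."""
    return {
        (x, y): snp_distance(cases[x], cases[y])
        for x, y in combinations(sorted(cases), 2)
    }


def cluster(cases, threshold=SNP_THRESHOLD):
    """Single-linkage clustering: join cases whose distance is <= threshold."""
    parent = {c: c for c in cases}

    def find(c):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (x, y), d in pairwise_distances(cases).items():
        if d <= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for c in cases:
        groups.setdefault(find(c), set()).add(c)
    return sorted(sorted(g) for g in groups.values())


# Invented toy data: calls restricted to regions that short reads can map.
short_read = {"A": {100: "T"}, "B": {100: "T"}, "C": {100: "T", 200: "G"}}

# Same cases with 13 invented extra SNPs in case C, standing in for
# differences found in regions only accessible to long reads.
long_read = {k: dict(v) for k, v in short_read.items()}
long_read["C"].update({pos: "A" for pos in range(1000, 1013)})

print(pairwise_distances(short_read))
print(cluster(short_read))
print(pairwise_distances(long_read))
print(cluster(long_read))
```

In this toy example, cases A and B look identical in both data sets, while the extra calls raise the distance between C and the other cases above the threshold, so C separates from the cluster. Real analyses depend on variant calling, filtering, and network construction choices that this sketch does not model.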
