Advancing Chlamydia trachomatis genomic surveillance and research with a novel core-genome MLST (cgMLST) approach
Abstract
Chlamydia trachomatis causes the most common bacterial sexually transmitted infection, with an estimated 129 million new cases annually. Its classification traditionally relies on ompA genotyping, but whole-genome sequencing (WGS) offers far greater resolution for studying evolution, transmission dynamics and epidemiological patterns. Yet WGS-based surveillance of C. trachomatis remains severely limited by technical challenges and the lack of standardized typing frameworks. Core-genome multilocus sequence typing (cgMLST) is a scalable and portable approach widely applied to bacterial pathogens, but it remains little explored for C. trachomatis. In this context, we compiled and curated the largest C. trachomatis genome dataset to date (1230 samples from 26 countries), including publicly available and newly generated assemblies, to develop a novel cgMLST schema optimized for standardized local deployment. Supported by existing (e.g., ReporTree) and newly developed bioinformatic resources, the extensive cgMLST analyses performed in this study enabled an unprecedented, in-depth exploration of C. trachomatis global phylogenomic diversity and recombination-driven evolution. The novel cgMLST schema (n = 846 loci) robustly recapitulated the four major evolutionary lineages of C. trachomatis and showed high congruence with core-SNP approaches, while providing sufficient resolution to resolve intra-lineage genogroup diversity and detect recombination mosaicism. It also efficiently captured the clonal expansion of epidemiologically relevant strains, including the lymphogranuloma venereum (LGV) epidemic “L2b” and the emergent L4 strains, further supporting its robustness for contemporary transmission and outbreak monitoring. By enabling a rapid link between loci/alleles and specific phylogenomic/phenotypic traits, the novel cgMLST approach not only elucidated the C. trachomatis genome-wide recombination landscape (e.g., through straightforward detection of major genotype-lineage incongruences), but also identified lineage-specific alleles (and disrupted loci) with potential diagnostic and/or functional relevance. Finally, to further advance C. trachomatis genomic surveillance and research, this novel schema is released (https://doi.org/10.5281/zenodo.17177579) together with a hierarchical cgMLST-based nomenclature that supports harmonized genogroup tracking across laboratories and countries. In summary, this work delivers both an expanded global C. trachomatis genomic resource and a robust cgMLST framework, with immediate utility for research and for standardized, high-resolution, genome-scale routine surveillance.

*Zohra Lodhia & Verónica Mixão contributed equally to this work.
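
To illustrate, in general terms, how a hierarchical nomenclature can be derived from cgMLST allele profiles, the following minimal Python sketch computes pairwise allelic distances and applies single-linkage clustering at two thresholds. It is not the authors' implementation. The sample names, the six-locus toy profiles (the released schema has 846 loci), the thresholds (3 and 1 allelic differences), the use of 0 for an uncalled locus, and the dotted code format are all assumptions made for illustration only.

```python
from itertools import combinations

MISSING = 0  # assumption: 0 marks a locus with no allele call


def allelic_distance(a, b):
    """Count loci where both profiles have a called allele and the alleles differ."""
    return sum(1 for x, y in zip(a, b) if x != MISSING and y != MISSING and x != y)


def single_linkage(profiles, threshold):
    """Cluster samples transitively when their allelic distance is <= threshold."""
    parent = {name: name for name in profiles}

    def find(n):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for s, t in combinations(sorted(profiles), 2):
        if allelic_distance(profiles[s], profiles[t]) <= threshold:
            rs, rt = find(s), find(t)
            if rs != rt:
                # Attach the larger root name to the smaller one for stable labels.
                parent[max(rs, rt)] = min(rs, rt)

    roots = sorted({find(n) for n in profiles})
    label = {root: i + 1 for i, root in enumerate(roots)}
    return {n: label[find(n)] for n in profiles}


def hierarchical_codes(profiles, thresholds):
    """Join cluster labels from the coarsest to the finest threshold, e.g. '1.2'."""
    levels = [single_linkage(profiles, t) for t in sorted(thresholds, reverse=True)]
    return {n: ".".join(str(level[n]) for level in levels) for n in profiles}


# Hypothetical toy profiles: one allele identifier per locus.
profiles = {
    "sample_A": [1, 1, 1, 1, 1, 1],
    "sample_B": [1, 1, 1, 1, 1, 2],
    "sample_C": [1, 1, 2, 2, 1, 2],
    "sample_D": [3, 3, 3, 3, 3, 3],
}

codes = hierarchical_codes(profiles, thresholds=[3, 1])
for name in sorted(codes):
    print(name, codes[name])
```

In this toy example, sample_A and sample_B share the same code at both levels, sample_C joins them only at the coarser threshold, and sample_D stays separate throughout. Reading a code from left to right therefore moves from broad to fine groupings, which is the property that makes such codes comparable across laboratories.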