Life identification number (LIN) codes for the genomic taxonomy of Corynebacterium diphtheriae strains

Abstract

Background

Corynebacterium diphtheriae, which causes diphtheria, remains a public health concern, especially in regions with low vaccination coverage. While advances in genomic typing, such as core-genome Multi-Locus Sequence Typing (cgMLST, based on 1,305 genes), have improved strain identification, a standardized and stable genomic taxonomy is still lacking. This study aimed to establish a consistent classification and nomenclature for C. diphtheriae strains using cgMLST-based Life Identification Number (LIN) codes.

Methods

Comparing 1,665 genomes from C. diphtheriae and its closely related species C. belfantii and C. rouxii, we observed population-level genetic discontinuities in cgMLST profile dissimilarities and established hierarchical taxonomic levels based on optimal allelic difference thresholds. Ten-level LIN codes were defined, encompassing broad population structure subdivisions and fine-scale epidemiological levels. The LIN code system was implemented in the BIGSdb-Pasteur platform, and nicknames derived from the 7-locus MLST sequence types were given to sublineages and clonal groups.
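The profile dissimilarity mentioned above can be illustrated with a minimal Python sketch that counts allelic differences between two cgMLST profiles. The function name, the use of `None` for an uncalled locus, and the choice to skip loci missing in either profile are assumptions made for illustration; the abstract does not state how missing data are handled.

```python
from typing import Optional, Sequence


def allelic_mismatches(
    profile_a: Sequence[Optional[int]],
    profile_b: Sequence[Optional[int]],
) -> int:
    """Count loci at which two cgMLST profiles carry different alleles.

    Each profile lists one allele number per locus, in the same locus order.
    None marks an uncalled locus; such loci are skipped (an assumption of
    this sketch, not a rule taken from the source).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


# Toy four-locus example; the scheme described above uses 1,305 loci.
toy_a = [1, 2, 3, None]
toy_b = [1, 5, 3, 7]
toy_distance = allelic_mismatches(toy_a, toy_b)
```

In this toy case only the second locus is compared and found different, because the fourth locus is uncalled in one profile.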

Results

cgMLST genetic thresholds were first defined at species (minimum of 1,240 allelic differences) and lineage levels (1,035 differences). Sublineages (SL), clonal groups (ClG), and genetic clusters (GC) were next defined with progressively finer allelic mismatch thresholds (500, 55, and 25 differences, respectively). A broad population diversity of C. diphtheriae was uncovered, with >400 SLs and >1,000 GCs distinguished. For epidemiological purposes, five shallow-level thresholds (8, 4, 2, 1, and 0 allelic mismatches) were defined, completing the 10-level LIN code taxonomy. We illustrate the applicability of LIN codes to investigating the genetic diversity and transmission chains of relevant clusters, such as SL8 (the 1990s ex-USSR outbreak) or SL384 (involved in outbreaks in Yemen and Europe).
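As a rough illustration of how the ten thresholds nest, the following Python sketch maps a pairwise allelic mismatch count to the number of leading LIN code levels two genomes would be expected to share. It makes several assumptions: a level counts as shared when the mismatch count is at or below its threshold (the abstract does not say whether the boundaries are inclusive), the labels for the five shallow levels are hypothetical, and the function names are invented here. It is not the assignment procedure implemented in BIGSdb-Pasteur, which the abstract does not describe.

```python
# Thresholds in allelic mismatches, from the broadest to the finest level,
# as listed in the Results paragraph above.
THRESHOLDS = [1240, 1035, 500, 55, 25, 8, 4, 2, 1, 0]

# The first five labels follow the abstract; the last five are hypothetical
# names for the unnamed shallow epidemiological levels.
LEVEL_LABELS = [
    "species", "lineage", "SL", "ClG", "GC",
    "shallow-8", "shallow-4", "shallow-2", "shallow-1", "shallow-0",
]


def shared_prefix_length(mismatches: int) -> int:
    """Return how many leading LIN code levels are shared.

    Assumption: a level is shared when mismatches <= its threshold.
    """
    if mismatches < 0:
        raise ValueError("mismatch count cannot be negative")
    shared = 0
    for threshold in THRESHOLDS:
        if mismatches > threshold:
            break
        shared += 1
    return shared


def deepest_shared_level(mismatches: int) -> str:
    """Return the label of the finest shared level, or 'none'."""
    shared = shared_prefix_length(mismatches)
    return LEVEL_LABELS[shared - 1] if shared else "none"


# Two genomes differing at 30 loci would share the first four levels,
# that is, the same clonal group but different genetic clusters.
example_level = deepest_shared_level(30)
```

Under these assumptions, identical profiles share all ten levels, while a pair differing at more than 1,240 loci shares none.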

Conclusions

The cgMLST-based LIN code system provides a stable genomic taxonomy for strains of C. diphtheriae, C. rouxii and C. belfantii. By defining ten hierarchical levels of resolution, this system effectively captures their phylogenetic diversity, facilitating population biology research and epidemiological surveillance. The public availability of this system from the BIGSdb-Pasteur platform provides a standardized framework for diphtheria genomic epidemiology, with the potential to harmonize global surveillance of the resurgence of diphtheria.
