Genomic and Patient Epidemiology of Streptococcus dysgalactiae Subspecies equisimilis in Houston, Texas
Abstract
Many countries have reported increased human infections caused by Streptococcus dysgalactiae subspecies equisimilis (SDSE) in the last 15 years. However, there is scant molecular epidemiology data for SDSE in the United States, especially at the whole genome level. To address this knowledge deficit, we studied SDSE infections in a large health care system in the Houston, Texas, metroplex, which has an ethnically diverse population of 7.4 million. We used Illumina whole genome sequencing to characterize 865 human isolates collected consecutively from unique patients with diverse infections in the Houston Methodist Hospital system between June 2022 and August 2024. Genomic clustering assigned the isolates to 44 distinct genetic lineages (GLs). GLs correlated more strongly with the population structure, as reflected by the single nucleotide polymorphism phylogenetic tree, than did emm type or multilocus sequence type. We found that absolute recombination was nearly twice as prevalent in SDSE as in Streptococcus pyogenes, its closest genetically related species. This finding may explain why, in contrast to S. pyogenes, emm typing is a poor molecular marker of overall genomic relationships in SDSE. We identified significant isolate genotype–patient phenotype associations: GL01 was associated with skin or soft tissue infections, GL02 with blood and urine infections, and GL03 with throat infections. GL02 was also associated with increased clinical severity. In the aggregate, our study provides new information about SDSE infections occurring in a large metropolitan area in the United States and expands our understanding of the genomic epidemiology of SDSE, an emerging human bacterial pathogen of increasingly recognized importance.
IMPORTANCE
Our study provides considerable new information about the genomic epidemiology and patient characteristics of SDSE infections in a large metropolitan area in the United States. We discovered that all abundantly occurring genetic lineages were composed of isolates with multiple emm gene types and multilocus sequence types. Analyses based solely or predominantly on these two commonly employed molecular epidemiologic markers obscure a detailed understanding of SDSE genetic diversity and population genomics, and may fail to reveal important disease associations. Our work highlights the need for longitudinal, whole genome sequencing-based surveillance and analysis of this emerging human pathogen. Such efforts will contribute to an enhanced understanding of SDSE epidemiology and patient demographics, and may aid improved diagnostics, infection control, public health strategies, and vaccine development for a pathogen that disproportionately affects older patients and patients with underlying medical conditions. These at-risk populations are currently expanding rapidly in the United States and many other high-income countries.