CagY sequence and structural motifs are associated with ancestry and disease in world-wide Helicobacter pylori strains
Abstract
Helicobacter pylori (Hp) interacts with gastric epithelial cells using a multi-protein type 4 secretion system (T4SS) whose function is regulated by CagY. CagY is the T4SS backbone, connecting the bacterial cytoplasm with the epithelial cytoplasm and interacting with most other T4SS proteins. However, its mechanisms of action are unknown. We aimed to comprehensively analyse cagY in a worldwide collection of strains to uncover any correlation between sequence diversity and ancestry and/or disease, including an analysis of critical domains and associated structural changes. We used 674 cagY sequences derived from 1,012 Hp genomes from the Hp Genome Project, sequenced with Single Molecule Real-Time technology. Phylogenetic and principal component analyses (PCA) revealed a strong population structure among European, Asian, African and American clusters, which interfered with other analyses. To address this heterogeneity, dimensionality reduction with PCA and Linear Discriminant Analysis was used to minimize interference before studying possible associations of cagY diversity with disease. For this analysis, cagY genes were fragmented into unitigs using unitig-caller, within a purpose-built analytical pipeline that clearly separated the non-atrophic gastritis (NAG), advanced intestinal metaplasia and gastric cancer (GC) groups. Sixty-four unitigs significantly separated GC from NAG based on cagY sequences, and the most strongly GC-associated unitigs were localized in the middle repeat region (MRR) of the protein, particularly in the A module, including the cysteine-containing motif YLDCVSQ. Proline was enriched in the MRR and VirB10 regions. The consequences of this diversity for the protein structure were analysed using ChimeraX. The validated and stabilized 14-multimer model showed major structural changes along the multimer structure, particularly in the cysteine-rich MRR region, which might be one mechanism by which CagY regulates T4SS function.
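As an illustration of the kind of sequence-level checks described above, the following minimal sketch locates the YLDCVSQ motif in a protein sequence and computes the proline fraction of a region. It is not the authors' pipeline: the function names and the toy sequence are hypothetical, invented only for this example, and real CagY sequences and region boundaries would have to come from the study data.

```python
from typing import List


def find_motif(protein: str, motif: str = "YLDCVSQ") -> List[int]:
    """Return the 0-based start positions of every occurrence of `motif`."""
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append(start)
        # Continue one residue further so overlapping hits are not missed.
        start = protein.find(motif, start + 1)
    return hits


def residue_fraction(region: str, residue: str = "P") -> float:
    """Return the fraction of positions in `region` occupied by `residue`."""
    if not region:
        return 0.0
    return region.count(residue) / len(region)


# Toy sequence (hypothetical, NOT a real CagY fragment).
toy = "MKPPYLDCVSQAEPKYLDCVSQPP"

motif_positions = find_motif(toy)
proline_share = residue_fraction(toy)

print("motif start positions:", motif_positions)
print("proline fraction:", round(proline_share, 3))
```

Comparing such a fraction between regions (for example MRR versus the rest of the protein) across many strains would require an appropriate statistical test, which this sketch does not attempt.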