Pneumococcal genetic variability in age-dependent bacterial carriage

Curation statements for this article:
  • Curated by eLife

Abstract

The characteristics of pneumococcal carriage vary between infants and adults. Host immune factors have been shown to contribute to these age-specific differences, but the role of pathogen sequence variation is currently less well known. Identification of age-associated pathogen genetic factors could lead to improved vaccine formulations. We therefore performed genome sequencing in a large carriage cohort of children and adults and combined this with data from an existing age-stratified carriage study. We compiled a dictionary of pathogen genetic variation, including serotype, strain, sequence elements, single-nucleotide polymorphisms (SNPs), and clusters of orthologous genes (COGs) for each cohort – all of which were used in a genome-wide association with host age. Age-dependent colonization showed weak evidence of being heritable in the first cohort ($h^2 = 0.10$, 95% CI 0.00–0.69) and stronger evidence in the second cohort ($h^2 = 0.56$, 95% CI 0.23–0.87). We found that serotypes and genetic background (strain) explained a proportion of the heritability in the first cohort ($h^2_{\mathrm{serotype}} = 0.07$, 95% CI 0.04–0.14 and $h^2_{\mathrm{GPSC}} = 0.06$, 95% CI 0.03–0.13) and the second cohort ($h^2_{\mathrm{serotype}} = 0.11$, 95% CI 0.05–0.21 and $h^2_{\mathrm{GPSC}} = 0.20$, 95% CI 0.12–0.31). In a meta-analysis of these cohorts, we found one candidate association ($p = 1.2 \times 10^{-9}$) upstream of an accessory Sec-dependent serine-rich glycoprotein adhesin. Overall, while we did find a small effect of pathogen genome variation on pneumococcal carriage between child and adult hosts, this was variable between populations and does not appear to be caused by strong effects of individual genes. This supports proposals for adaptive future vaccination strategies that are primarily targeted at dominant circulating serotypes and tailored to the composition of the pathogen populations.
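To make the idea of combining two cohorts in a meta-analysis concrete, the following is a minimal sketch of a fixed-effect, inverse-variance meta-analysis for a single variant. It is not the authors' pipeline: the choice of a fixed-effect model is an assumption, the function name `fixed_effect_meta` is hypothetical, and the effect sizes and standard errors are invented for illustration.

```python
import math

def fixed_effect_meta(estimates):
    """Combine (beta, standard_error) pairs using inverse-variance weights."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Hypothetical per-cohort effect sizes (log-odds of adult carriage) and
# standard errors for one variant; these numbers are NOT from the study.
cohorts = [(0.80, 0.20), (0.60, 0.15)]
beta, se, p = fixed_effect_meta(cohorts)
print(f"pooled beta={beta:.3f}, se={se:.3f}, p={p:.2e}")
```

The pooled estimate sits closer to the cohort with the smaller standard error, which is why a signal that is modest in each cohort alone can become more significant once the cohorts are combined.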

Article activity feed

  1. Evaluation Summary:

    Strain variability in bacterial infections is a confounding factor in the treatment and prevention of the associated diseases. Pneumococcal disease is widespread, and the current vaccine targets only a subset of circulating strains, with disease and vaccine efficacy likely varying with the age of the host. Using two large databases of pneumococcal genomes, this study explores the associations between genomic factors and the age of the human host. Ultimately, these data and related studies will establish whether and how vaccines should be differentially designed for children and the elderly. This work will be of interest to those working in bacterial infections and host-pathogen genomics.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    In this study the authors test whether and how serotype, genomic background, and genetic features are associated with host age in pneumococcal disease.

    The strength of the work lies in the high-quality analyses and the large dataset of whole genome sequences. The dataset comprises >4,000 genomes collected from infants and adults in a vaccinated Dutch cohort and an unvaccinated Thai cohort. The two cohorts show little overlap in their genomic makeup.

    The authors find that, within each cohort, there is a solid signal for genetic background. Specifically, when the data are plotted by serotype and sequence type, they find an association with age (however, these associations differ between cohorts). Furthermore, the association with serotype was also observed in additional analyses investigating genes associated with carriage age. Together, these data suggest that serotype and/or its genomic context is associated with age.

    The authors also investigated whether any genetic variations are associated with age. Their analysis was not dependent on presence/absence alone, but also considered variations in the genome. The signal did not reveal a clear set of genomic regions that likely influence the molecular mechanisms of disease in an age-dependent manner. Nonetheless, the association of an adhesin factor with age deserves further consideration.

    Overall, the study suggests that age may be a consideration for vaccine design. Similar studies in additional datasets are warranted, and a bioinformatic framework for such studies is presented.

  3. Reviewer #2 (Public Review):

    The authors looked for pneumococcal traits specific to host age by comparing the genomes of pneumococcal carriage isolates from infants with those from their parents in the Netherlands and Maela. The authors report that host age was to some extent explained by pneumococcal genetic variability. Items to evaluate are potential sampling bias and the conclusions inferred from the data.

    Clear strengths of this study are an interdisciplinary team, robust bioinformatics analyses, and the large study populations.

    It is unclear what differences between pneumococcal isolates from infants and their parents were expected. The study design may be better motivated.

    Whether the results support the conclusion depends on certain methodological aspects that require additional clarification. 1) An overview of the cohorts (in terms of percentage of carriers and degree of parent-child relatedness between pneumococcal isolates) is necessary to interpret the results. 2) Inclusion of repeat samples from infants and/or parents would mimic overrepresentation of genetic variants in a category.

    In the discussion, the results should be put into the context of previous pneumococcal GWAS studies that reported on the relation to host age and to geography. In addition, alternative explanations for the observations and for the claimed causality should be evaluated.

    Provided they are of high quality, the Dutch pneumococcal carriage genomes would add a rich source of data to the field. Because both serotype and lineage predicted host age to some degree, the necessity of a capsular polysaccharide-based vaccine target does not seem that evident on the basis of this study. And even if specific serotypes (capsular polysaccharides) are targeted, these often co-occur with specific proteins. If the authors could demonstrate a stratified vaccination strategy for the populations involved, that would support their conclusion.

  4. Reviewer #3 (Public Review):

    The goal of this study is to determine the association of pneumococcal genome sequence variation with host age. The authors performed whole genome sequencing analysis of 4320 samples isolated from infants and adults from the Netherlands and Myanmar. While the manuscript is well written, its data presentation is not readily understandable and its findings are not conclusive, which will hamper the translation of the sequencing data into an understanding of pneumococcal carriage dynamics and population-based vaccine design.

    Strengths:

    1. Large collections of pneumococcal carriage isolates from child and adult populations in two countries. The authors performed admirable whole genome sequencing of 1,329 pneumococcal isolates from the adult and Dutch cohort and 3,085 isolates from the Myanmar cohort. This should represent the largest sample size in any study of this kind on pneumococcal carriage.

    2. Whole genome sequencing analysis of large numbers of bacterial strains. This study undertook genome sequencing analysis of 4320 pneumococcal strains, and presents a comprehensive set of data.

    3. Identification of the Sec-dependent serine-rich glycoprotein adhesin locus as an association candidate. Since the function of this locus has not been well characterized, this information is highly valuable for future investigation of pneumococcal differential carriage in child and adult populations.

    Weaknesses:

    1. The presentation of the results is too sketchy. While it is understandable that the sequencing data need to be compressed into a presentable format, essential information needs to be displayed logically for the sake of the reader's understanding. For example, the first section of the Results mentions the total numbers of isolates and serotypes from each cohort, but does not say how many of them were from children and how many from adults. Figure 1 does have age information, but it is difficult to evaluate because of the data transformation. Sequence clusters are mentioned without elaboration on what they mean. This style of data presentation may be readily comprehensible to sequencing experts, but it is hard to digest for experimentalists like myself.

    2. There is a lack of experimental confirmation of any of the sequencing data. This manuscript is a nice example of traditional sequencing analysis of a large number of pneumococcal carriage isolates. It would be desirable for the authors to test the key finding, the Sec-dependent serine-rich glycoprotein adhesin locus as an association candidate, to some extent in model systems.
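
Two of the requests above, Reviewer #2's concern about repeat samples and Reviewer #3's request for isolate counts by age group, can be illustrated with a short sketch. It assumes a hypothetical record layout (`host_id`, `cohort`, `age_group`, `serotype`) and invented records. It does not reflect how the authors stored or filtered their data, and treating a repeat as "same host, same serotype" is an assumption made here for illustration.

```python
from collections import Counter

# Hypothetical isolate records; field names and values are illustrative only.
isolates = [
    {"host_id": "NL-001", "cohort": "Dutch", "age_group": "child", "serotype": "19A"},
    {"host_id": "NL-001", "cohort": "Dutch", "age_group": "child", "serotype": "19A"},  # repeat swab
    {"host_id": "NL-002", "cohort": "Dutch", "age_group": "adult", "serotype": "11A"},
    {"host_id": "MA-001", "cohort": "Maela", "age_group": "child", "serotype": "23F"},
    {"host_id": "MA-001", "cohort": "Maela", "age_group": "child", "serotype": "6B"},   # different serotype, kept
    {"host_id": "MA-002", "cohort": "Maela", "age_group": "adult", "serotype": "23F"},
]

def drop_repeat_samples(records):
    """Keep only the first isolate for each (host, serotype) pair."""
    seen = set()
    kept = []
    for rec in records:
        key = (rec["host_id"], rec["serotype"])
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept

def overview(records):
    """Count isolates per (cohort, age group)."""
    return Counter((rec["cohort"], rec["age_group"]) for rec in records)

raw_counts = overview(isolates)
dedup_counts = overview(drop_repeat_samples(isolates))

for key in sorted(raw_counts):
    print(key, "raw:", raw_counts[key], "deduplicated:", dedup_counts[key])
```

Comparing the raw and deduplicated counts shows how repeat swabs from one host can inflate the apparent frequency of a variant in an age category, which is the overrepresentation that Reviewer #2 describes.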