Human gut microbiota subspecies carry implicit information for in-depth microbiome research
Abstract
Microbial strains from the same species can have distinct functional characteristics owing to differences in gene content. At the highest resolution, however, strains are mostly host-specific, which obscures unbiased associations and hinders deductive research. Here, we comprehensively define the human gut microbiota at a consistently annotated subspecies resolution in an unbiased, cohort-independent manner, and demonstrate that this resolution generalizes across distinct populations worldwide while maintaining specificity and improving interstudy reproducibility. We developed panhashome, a sketching-based method for rapid subspecies quantification and for identifying the genes that drive intraspecies variation, and showed that subspecies carry implicit information that is undetectable at the species level. Through a meta-analysis of colorectal cancer (CRC) datasets, we identified disease-associated subspecies whose sibling subspecies, or the species as a whole, are not disease-associated. A subspecies-based machine-learning algorithm for CRC diagnosis outperformed species-level methods by leveraging this unique subspecies-level information. The subspecies catalogue allows identification of the genes that drive functional differences between subspecies, a fundamental step in mechanistically understanding microbiome-phenotype interactions.
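The abstract does not describe how panhashome works internally, so the following is only a generic illustration of what "sketching" usually means in sequence analysis, not the authors' method. It is a minimal bottom-k MinHash sketch over k-mers, assuming toy sequences, a small k, and hypothetical function names (`kmers`, `hash64`, `sketch`, `jaccard_estimate`) chosen for this example.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer of a DNA string."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash64(kmer):
    """Stable 64-bit hash of a k-mer (BLAKE2b, so results do not vary between runs)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=16):
    """Bottom-k MinHash sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(a, b, size=16):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches."""
    union = sorted(set(a) | set(b))[:size]  # bottom-k sketch of the union
    if not union:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in union if h in shared) / len(union)


# Toy sequences standing in for genomes (illustrative only, not real data).
genome_a = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGA"
genome_b = "ACGTTGCAAGCTTAGGCTAACGTTAGCGATGA"  # one substitution relative to genome_a

similarity = jaccard_estimate(sketch(genome_a), sketch(genome_b))
print(f"estimated k-mer Jaccard similarity: {similarity:.2f}")
```

Because each sketch keeps only a fixed number of hashes regardless of genome size, comparing two sketches is much cheaper than aligning the underlying sequences, which is the general reason sketching-based tools are fast.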