Expanded gut microbial genomes from Chinese populations reveal population-specific genomic features related to human physiological traits
Abstract
Background
A comprehensive and representative reference database is crucial for accurate taxonomic and functional profiling of the human gut microbiome in population-level studies. However, with 70% of current microbial reference data derived from Europeans and Americans, East Asia, especially China, remains underrepresented.

Methods
We constructed the human Gut Microbiome Reference (GMR), comprising 478,588 high-quality microbial genomes from Chinese (247,134) and non-Chinese (231,454) populations. Species-level clustering and protein annotation were performed to characterize microbial diversity and function. We further integrated the novel genomes into a taxonomic profiling database and validated the improvements using independent cohort data.

Results
The GMR dataset spans 6,664 species, of which 26.4% are newly classified, and encodes over 20 million unique proteins, 47% of which lack known functional annotations. Notably, 35.35% and 32.46% of species were unique to Chinese and non-Chinese populations, respectively. Of the 2,145 species shared between populations, 304 had balanced prevalence in both, and 74% of these exhibited population-specific phylogenetic stratification involving health-relevant functions such as antibiotic resistance. Integrating the novel genomes into the taxonomic profiling database improved population-level species profiling by up to 23% and uncovered replicable associations between novel species and host physiological traits.

Conclusions
Our study substantially expands the compositional and functional landscape of the human gut microbiome, providing a crucial resource for studying the role of the gut microbiome in regional health disparities.
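The counts and percentages quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the authors' pipeline: the variable names and the `implied_count` helper are hypothetical, and the derived species counts are approximations because the reported percentages are rounded.

```python
# Figures quoted in the abstract (variable names are illustrative only).
chinese_genomes = 247_134
non_chinese_genomes = 231_454
total_genomes = 478_588
total_species = 6_664
shared_species = 2_145
pct_unique_chinese = 35.35
pct_unique_non_chinese = 32.46


def implied_count(total: int, percent: float) -> int:
    """Convert a reported percentage of `total` into a rounded count."""
    return round(total * percent / 100)


# Genome counts for the two population groups should add up to the GMR total.
genome_sum = chinese_genomes + non_chinese_genomes

# Species counts implied by the rounded percentages (approximate, not reported).
unique_chinese = implied_count(total_species, pct_unique_chinese)
unique_non_chinese = implied_count(total_species, pct_unique_non_chinese)

# Species that are not unique to either group are, by definition, shared.
implied_shared = total_species - unique_chinese - unique_non_chinese

print("genome totals agree:", genome_sum == total_genomes)
print("implied shared species:", implied_shared, "reported:", shared_species)
```

Under these assumptions the genome counts sum to the stated total, and the species left over after removing the two population-unique fractions match the 2,145 shared species reported in the Results.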