Patrilineages of ethnolinguistically diverse populations reveal multifactorial influences on Chinese paternal population stratification
Abstract
Large-scale Y-chromosome genetic resources provide critical insights into human evolutionary history. However, the limited high-density Y-chromosomal data from ethnolinguistically diverse Chinese populations hinder the reconstruction of fine-scale population stratification and the exploration of its complex influencing factors. We report large-scale Y-chromosome variation data from 5,311 unrelated males in the pilot phase of the 10K Chinese People Genomic Diversity Project. We identified clear north-south and west-east genetic substructures among Chinese populations, reflecting distinct regional genetic origins and migration patterns. Using admixture modeling coupled with phylogenetic and phylogeographic analyses, we illuminate how multiple cultural and demographic factors, including subsistence strategy shifts, language barriers, and geographic isolation, have shaped Chinese paternal population dynamics. Paternal genetic diversity follows complex patterns, with a haplogroup frequency spectrum and a variation-based phylogenetic tree indicating that more than 95% of paternal lineages belong to haplogroups O, C, N, D, and Q. The phylogeographic analysis revealed distinct regional haplogroup distribution patterns linked to subsistence strategy shifts and ancestral population dispersal. The predominance of Neolithic farmer-related lineages suggests that agriculture-related lineages promoted population differentiation between ancient northern and southern East Asians. We observed significant lineage sharing between Han Chinese and minority ethnic groups, with farming- and herding-related lineages contributing to the northwestern paternal gene pool. Spatial autocorrelation and principal component analyses emphasized genetic connections between Han Chinese and ethnic minorities, highlighting complex admixture and migration aligned with geographical and linguistic divisions.
These findings support the farming-language dispersal hypothesis as an explanation for Chinese paternal lineage formation and underscore the role of geographic and linguistic isolation in shaping the genetic landscape. This study demonstrates the unique value of large-scale Y-chromosome data in uncovering human evolutionary complexity.