Inferring Amazigh Genetic History through Proxy Populations: Insights from the 1000 Genomes Project
Abstract
Understanding the genetic structure of North African populations within a global context remains an essential yet understudied area of human population genomics. In this study, we analyzed a subset of individuals from the 1000 Genomes Project, comprising Iberian (IBS), Tuscan (TSI), Northern/Western European (CEU), and Yoruba (YRI) populations, to contextualize Amazigh-related ancestries using chromosome 22 data. Using a filtered set of ~50,000 high-quality biallelic SNPs, we performed Principal Component Analysis (PCA), ADMIXTURE clustering, FST analysis, and Multidimensional Scaling (MDS). PCA revealed a clear continental split between African and European individuals, with minimal separation among European subpopulations. ADMIXTURE analysis (K=4) detected subtle intra-European components and a marked African-European ancestry divergence, consistent with known demographic histories. Pairwise FST values confirmed these patterns, with low differentiation among European groups (FST ≈ 0.0016–0.0030) and much higher divergence between each European group and YRI (FST ≈ 0.137–0.141). Ancestry proportions varied slightly by gender, though the differences were not statistically significant. We further visualized inter-individual relatedness using phylogenetic trees and genetic distance matrices, which aligned with continental ancestries. Collectively, these findings underscore the genetic continuity among Southern European populations and their divergence from West African ancestry, providing a strong reference for future studies involving indigenous North African (Amazigh) genomic data. Our integrative, Python-based workflow demonstrates how publicly available datasets can illuminate population structure and support future North African-focused genome studies.
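The abstract does not state which FST estimator or PCA implementation the workflow uses, so the following is only a minimal sketch of the two core computations under stated assumptions. It assumes Hudson's FST estimator (ratio of averages) and a plain SVD-based PCA on mean-centered genotypes. It runs on synthetic 0/1/2 genotype matrices rather than 1000 Genomes data, and all function names (`hudson_fst`, `pca_scores`, `drift`, `simulate_population`) are hypothetical.

```python
import numpy as np


def hudson_fst(geno_a, geno_b):
    """Hudson's FST (ratio of averages) between two populations.

    geno_a, geno_b: integer arrays of shape (individuals, SNPs) holding
    alternate-allele counts (0, 1 or 2) for biallelic SNPs.
    Assumption: the preprint's estimator may differ from this one.
    """
    n_a = 2 * geno_a.shape[0]  # sampled chromosomes in population A
    n_b = 2 * geno_b.shape[0]  # sampled chromosomes in population B
    p_a = geno_a.sum(axis=0) / n_a  # per-SNP allele frequency in A
    p_b = geno_b.sum(axis=0) / n_b  # per-SNP allele frequency in B
    # Squared frequency difference, corrected for finite sample size.
    num = ((p_a - p_b) ** 2
           - p_a * (1 - p_a) / (n_a - 1)
           - p_b * (1 - p_b) / (n_b - 1))
    den = p_a * (1 - p_b) + p_b * (1 - p_a)
    return num.sum() / den.sum()


def pca_scores(geno, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    centered = geno - geno.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def drift(rng, freqs, sd):
    """Perturb allele frequencies to mimic drift (illustrative only)."""
    return np.clip(freqs + rng.normal(0.0, sd, size=freqs.size), 0.01, 0.99)


def simulate_population(rng, freqs, n_individuals):
    """Draw diploid genotypes assuming Hardy-Weinberg proportions."""
    return rng.binomial(2, freqs, size=(n_individuals, freqs.size))


# Synthetic stand-ins: two closely related groups and one divergent group.
rng = np.random.default_rng(0)
ancestral = rng.uniform(0.1, 0.9, size=2000)
pop_close_1 = simulate_population(rng, drift(rng, ancestral, 0.02), 50)
pop_close_2 = simulate_population(rng, drift(rng, ancestral, 0.02), 50)
pop_far = simulate_population(rng, drift(rng, ancestral, 0.25), 50)

print("FST close pair:", hudson_fst(pop_close_1, pop_close_2))
print("FST distant pair:", hudson_fst(pop_close_1, pop_far))

scores = pca_scores(np.vstack([pop_close_1, pop_close_2, pop_far]))
print("PC score matrix shape:", scores.shape)
```

With real data, the genotype matrices would instead be built from the filtered chromosome 22 VCF, one matrix per population label (IBS, TSI, CEU, YRI). The ratio-of-averages form sums numerator and denominator across SNPs before dividing, which keeps low-frequency variants from dominating the estimate.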