Complete and haplotype-resolved maps of genomic and epigenetic discordance in monozygotic twins
Abstract
Telomere-to-telomere (T2T) genome assemblies are indispensable for accurate detection of genetic variation and for resolving complex repetitive regions. Monozygotic (MZ) twin pedigrees provide a powerful model for investigating de novo mutations (DNMs); however, comprehensive, haplotype-resolved analyses of structural variation (SV), allele-specific inheritance in complex regions, and DNA methylation in diploid human genomes remain limited. Here, we generated complete, haplotype-resolved T2T assemblies for two female twins (C33 and C35) from a Han Chinese pedigree by integrating complementary, state-of-the-art sequencing technologies. The resulting T2T-C33 and T2T-C35 assemblies are highly contiguous and complete, with Genome Continuity Inspector (GCI) scores of 74.94 (maternal) and 77.94 (paternal), and consensus quality values (QV) >75 (k = 21). We cataloged 62 inter-twin single-nucleotide variants (SNVs) and 15 small indels, and identified both shared and private DNMs, revealing nascent genomic divergence between the MZ twins. Focused interrogation of complex regions uncovered pronounced haplotype-specific length polymorphisms and structural heterogeneity within centromeric higher-order repeat (HOR) arrays. Notably, we observed extensive HOR copy-number variation between haplotypes, including a large copy-number difference on maternal chromosome 18, underscoring dynamic HOR array evolution even among genetically identical individuals. Concurrently, genome-wide DNA methylation profiling delineated allele-specific epigenetic variation that may contribute to phenotypic discordance. Together, these high-quality, diploid T2T genomes from a Han Chinese pedigree provide a valuable resource for population-aware genomics and reveal fine-scale, haplotype-specific divergence in MZ twins.
Our results advance understanding of repeat dynamics, centromeric architecture, epigenetic variation and the spectrum of human genomic variation at single-base and structural scales.