Single-molecule variation in telomeric sequence and structure across humans
Abstract
The repetitive architectures of telomeric and subtelomeric regions have obscured studies of their genetic variation and chromatin organization across the human population. Here, we integrate near-complete diploid genome assemblies from 212 individuals with matched long-read sequencing data to construct an atlas of 316,146 telomere-spanning molecules across 12,080 chromosome-end-resolved telomere arrays. This atlas reveals that nearly every chromosome end harbors a structured and unique pattern of telomere variant repeats (TVR), or TVR code, with subtelomere-proximal TVR codes being heritable, somatically stable, and influenced by subtelomeric TAR1 regulatory elements. Despite ongoing cycles of telomere shortening and elongation in the germline, proximal TVR codes are maintained across the human population. These TVR codes expose rare telomerase-independent events that lengthen telomeres in the germline, including interchromosomal telomere exchange and recurrent internal duplications within telomere arrays. Furthermore, single-molecule chromatin fiber sequencing across 26,972 molecules spanning the telomere-subtelomere boundary confirms that TVR-rich regions adopt telomeric chromatin but introduce discrete discontinuities into otherwise compact telomeric chromatin fibers. Together, our results link chromosome-end sequence variation to telomere cap formation and telomerase-independent telomere extension mechanisms in the human germline.