The Emirati T2T-Level Pangenome: A Graph of 58 Complete Genomes
Abstract
Reference data on genomic variation form the basis of genetic research. The limitations of identifying genetic variation against a single reference sequence have recently been overcome, as improvements in sequencing technologies now allow pangenomic references to be built from multiple accurate, chromosome-level de novo assemblies. Here, we present a comprehensive Emirati telomere-to-telomere (T2T) pangenome generated from 58 individuals, comprising 28 trio-based and 30 single-sample assemblies. The resulting 116 haplotype-resolved assemblies (two per individual) demonstrate high contiguity, with a median continuity of 150 Mb and a median quality value (QV) of 59, achieving T2T-level scaffold status for 71.9% of chromosomes. These assemblies form the foundation of the Emirati T2T pangenome graph. The graph reveals levels of genomic diversity comparable to those reported by the Human Pangenome Reference Consortium, while also capturing regionally enriched and difficult-to-assemble variation that is uniquely accessible through the Emirati T2T assemblies. This reference makes a valuable global contribution to human pangenomics and serves as a critical resource for advancing precision medicine in the United Arab Emirates.
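To put the reported figures in perspective, the sketch below converts a quality value into an approximate per-base error rate and checks the haplotype count. It assumes QV follows the standard Phred definition, QV = -10 log10(P_error); the abstract does not state how QV was estimated, and the function names are illustrative only.

```python
import math

def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability under the Phred definition (assumed here)."""
    return 10 ** (-qv / 10)

def error_rate_to_qv(p_error: float) -> float:
    """Inverse conversion: error probability back to a Phred-scaled QV."""
    return -10 * math.log10(p_error)

# Figures quoted in the abstract.
individuals = 58
median_qv = 59

# Each diploid individual contributes two haplotype-resolved assemblies.
haplotypes = individuals * 2

p_error = qv_to_error_rate(median_qv)   # roughly 1.26e-6
bases_per_error = 1 / p_error           # roughly 0.8 million bases per error

print(f"haplotype assemblies: {haplotypes}")
print(f"error rate at QV {median_qv}: {p_error:.3g}")
print(f"about one error every {bases_per_error:,.0f} bases")
```

Under this assumption, a median QV of 59 corresponds to roughly one erroneous base per 0.8 Mb of assembled sequence.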