The diploid reference genome of a human embryonic stem cell line

Abstract

Advances in DNA sequencing and assembly technologies are spurring a shift from haploid reference genomes to sample-specific diploid assemblies. Here, we generated the first telomere-to-telomere (T2T) diploid reference for the widely used human embryonic stem cell (hESC) line, H9 (WAe009-A). This haplotype-resolved assembly is highly accurate, with comprehensive annotation of genes, segmental duplications, methylation, and chromatin conformation. Pangenomic and phased-locus inference point to H9’s mixed ancestry with a predominant European component. H9-specific genomic features include near-perfect telomeres ∼1.65-fold longer than those of other T2T assemblies, consistent with telomerase activity during pluripotency; chromosome 17 inversions that can predispose offspring to neurological syndromes; and expansions of ncRNA clusters. Overall genomic stability is maintained despite extensive culturing. Mapping multi-omic datasets to the genome, we demonstrate the power of this resource for allele-specific, high-precision transcriptomic, genetic, and epigenetic analyses, with far-reaching implications for human development and disease.
