Telomere-to-Telomere Accurate and Gapless Korean Standard Reference Genome

Abstract

We present KOREF1-G-TTAGGA, a Telomere-to-Telomere Accurate and Gapless Genome Assembly of KOREF1, representing a Korean standard reference genome. It was constructed from 89× PacBio HiFi reads, 298× Oxford Nanopore Technologies ultra-long reads (104× from reads longer than 100 kb), and parental short reads (paternal: 37×; maternal: 40×). The paternal and maternal haplotypes span 2.91 Gb and 3.03 Gb, respectively, and close all gaps left unresolved in the previous KOREF1 releases of 2016 and 2022. Both phased haploid assemblies show high base-level accuracy (Quality Value, QV: 81.19 and 79.03, corresponding to one error per 132 Mb and 80 Mb) with minimal phasing errors (switch error: 0.1% and 0.33%, respectively). Evaluations of read-to-assembly concordance revealed very few structural errors (Assembly Quality Index, AQI: 99.77 and 99.69, respectively, well above the reference quality threshold of 90) together with high assembly continuity (Genome Continuity Inspector score: 84.92 and 80.78, respectively), establishing KOREF1 as one of the most complete Asian reference genomes. It can therefore serve as a foundational resource for constructing the future Korean Pangenome Reference. As part of a national multiomics reference initiative launched in 2006, KOREF1-G-TTAGGA is also accompanied by transcriptome, epigenome, proteome, and ATAC-seq data from blood and cell lines.
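The relationship between the reported QV values and the "one error per N Mb" figures can be checked with a short sketch. It assumes the standard Phred-scaled definition, QV = -10 · log10(p), where p is the per-base error probability. The abstract does not state which tool or formula produced the QV values, and the function names below are illustrative, not taken from the authors' pipeline.

```python
def qv_to_error_rate(qv: float) -> float:
    """Per-base error probability for a Phred-scaled quality value (assumed definition)."""
    return 10 ** (-qv / 10)


def bases_per_error(qv: float) -> float:
    """Expected number of bases per single error: the inverse of the error rate."""
    return 10 ** (qv / 10)


if __name__ == "__main__":
    # QV values as reported in the abstract for the two phased haploid assemblies.
    for qv in (81.19, 79.03):
        mb = bases_per_error(qv) / 1e6
        print(f"QV {qv}: about one error per {mb:.1f} Mb "
              f"(error rate {qv_to_error_rate(qv):.2e})")
```

Under this assumption, the two QV values round to the 132 Mb and 80 Mb figures quoted in the abstract.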
