Approaching an Error-Free Diploid Human Genome Assembly of East Asian Origin

Abstract

Achieving an error-free diploid human genome remains challenging. We report T2T-YAO v2.0, a complete telomere-to-telomere assembly of a Han Chinese individual, polished to near-perfect base-level and structural accuracy. To systematically identify assembly errors, we developed Sufficient Alignment Support (SAS), an automatic method that flags structural and base-level errors in windows lacking sufficient read support. Building on this, we established a "structural-error-first" polishing strategy: misassemblies are corrected first using ultra-long ONT reads, followed by base-level refinement with PWC (Platform-integrated Window Consensus). Using these approaches, we resolved all detectable structural errors and all non-homopolymer-related base errors outside ribosomal DNA (rDNA) regions. The resulting assembly contains no unsupported 21-mers across sequencing platforms, meeting k-mer-based criteria for an error-free genome. T2T-YAO v2.0 is the most accurate East Asian reference to date, with remaining issues confined to rDNA arrays and homopolymer tracts. It enables precise genome annotation, benchmarking, and variant discovery, providing a foundation for human genomics and precision medicine.
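The k-mer criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`canonical`, `read_kmer_set`, `unsupported_kmers`) are hypothetical, the reads are held in memory as plain strings, and a single occurrence in any read is assumed to count as support. A genome-scale check would rely on a dedicated k-mer counter and per-platform read sets.

```python
from typing import Iterable, Iterator, List, Set, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every overlapping k-mer of seq, in order."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    rc = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, rc)


def read_kmer_set(reads: Iterable[str], k: int) -> Set[str]:
    """Collect canonical k-mers seen in the reads (k-mers with non-ACGT bases are skipped)."""
    support: Set[str] = set()
    for read in reads:
        for km in kmers(read, k):
            if set(km) <= set("ACGT"):
                support.add(canonical(km))
    return support


def unsupported_kmers(assembly: str, reads: Iterable[str], k: int = 21) -> List[Tuple[int, str]]:
    """Return (position, k-mer) for assembly k-mers absent from every read.

    Assumption: one occurrence in the reads is enough to count as support.
    """
    support = read_kmer_set(reads, k)
    return [
        (pos, km)
        for pos, km in enumerate(kmers(assembly, k))
        if canonical(km) not in support
    ]


if __name__ == "__main__":
    # Toy example with k=5 so the effect is visible on short strings.
    reads = ["ACGTACG", "TACGTTG"]
    print(unsupported_kmers("ACGTACGTTG", reads, k=5))  # fully supported assembly
    print(unsupported_kmers("ACGTACCTTG", reads, k=5))  # one substituted base
```

In this toy setting, a single substituted base makes every k-mer overlapping it unsupported, which is why an assembly with no unsupported 21-mers is taken as evidence of base-level correctness.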
