Targeted sequencing and iterative assembly of near-complete genomes
Abstract
Advances in long-read sequencing (LRS) and assembly algorithms have made it possible to create highly complete genome assemblies for humans, animals and plants. However, ongoing development is needed to improve accessibility, affordability, and assembly quality and completeness. 'Cornetto' is a new strategy in which we use programmable selective nanopore sequencing to focus LRS data production on the unsolved regions of a nascent assembly. This improves assembly quality and streamlines the process for both humans and non-human vertebrates. Cornetto enables us to generate highly complete diploid human genome assemblies using only nanopore LRS data, surpassing the quality of previous efforts at a fraction of the cost. It also enables genome assembly from challenging sample types such as human saliva. Finally, we obtain accurate assemblies for clinically relevant repetitive loci at the extremes of the genome, demonstrating valid approaches for genetic diagnosis in facioscapulohumeral muscular dystrophy (FSHD) and MUC1-autosomal dominant tubulointerstitial kidney disease (MUC1-ADTKD).