High-coverage nanopore sequencing of samples from the 1000 Genomes Project to build a comprehensive catalog of human genetic variation

Jonas A. Gustafson
Sophia B. Gibson
Nikhita Damaraju
Miranda P.G. Zalusky
Kendra Hoekzema
David Twesigomwe
Lei Yang
Anthony A. Snead
Phillip A. Richmond
Wouter De Coster
Nathan D. Olson
Andrea Guarracino
Qiuhui Li
Angela L. Miller
Joy Goffena
Zachary B. Anderson
Sophie H.R. Storz
Sydney A. Ward
Maisha Sinha
Claudia Gonzaga-Jauregui
Wayne E. Clarke
Anna O. Basile
André Corvelo
Catherine Reeves
Adrienne Helland
Rajeeva Lochan Musunuri
Mahler Revsine
Karynne E. Patterson
Cate R. Paschal
Christina Zakarian
Sara Goodwin
Tanner D. Jensen
Esther Robb
The 1000 Genomes ONT Sequencing Consortium
University of Washington Center for Rare Disease Research (UW-CRDR)
Genomics Research to Elucidate the Genetics of Rare Diseases (GREGoR) Consortium
William Richard McCombie
Fritz J. Sedlazeck
Justin M. Zook
Stephen B. Montgomery
Erik Garrison
Mikhail Kolmogorov
Michael C. Schatz
Richard N. McLaughlin
Harriet Dashnow
Michael C. Zody
Matt Loose
Miten Jain
Evan E. Eichler
Danny E. Miller

Abstract

Fewer than half of individuals with a suspected Mendelian or monogenic condition receive a precise molecular diagnosis after comprehensive clinical genetic testing. Improvements in data quality and costs have heightened interest in using long-read sequencing (LRS) to streamline clinical genomic testing, but the absence of control data sets for variant filtering and prioritization has made tertiary analysis of LRS data challenging. To address this, the 1000 Genomes Project (1KGP) Oxford Nanopore Technologies Sequencing Consortium aims to generate LRS data from at least 800 of the 1KGP samples. Our goal is to use LRS to identify a broader spectrum of variation so we may improve our understanding of normal patterns of human variation. Here, we present data from analysis of the first 100 samples, representing all 5 superpopulations and 19 subpopulations. These samples, sequenced to an average depth of coverage of 37× and sequence read N50 of 54 kbp, have high concordance with previous studies for identifying single nucleotide and indel variants outside of homopolymer regions. Using multiple structural variant (SV) callers, we identify an average of 24,543 high-confidence SVs per genome, including shared and private SVs likely to disrupt gene function as well as pathogenic expansions within disease-associated repeats that were not detected using short reads. Evaluation of methylation signatures revealed expected patterns at known imprinted loci, samples with skewed X-inactivation patterns, and novel differentially methylated regions. All raw sequencing data, processed data, and summary statistics are publicly available, providing a valuable resource for the clinical genetics community to discover pathogenic SVs.
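
The abstract summarizes the data set with two standard metrics, average depth of coverage and read N50. As a rough illustration of what these numbers mean, the sketch below computes both from a list of read lengths. The function names, the toy read lengths, and the toy genome size are all hypothetical and chosen only for this example; this is not the consortium's pipeline.

```python
def read_n50(read_lengths):
    """Return the read N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    total_bases = sum(read_lengths)
    running = 0
    # Walk from the longest read downward until half the bases are covered.
    for length in sorted(read_lengths, reverse=True):
        running += length
        if 2 * running >= total_bases:
            return length


def mean_depth(read_lengths, genome_size):
    """Approximate average depth as total sequenced bases / genome size.
    Assumes every base aligns, which overestimates real aligned depth."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical toy data, not taken from the study.
toy_reads = [80_000, 60_000, 54_000, 30_000, 20_000, 10_000]
toy_genome_size = 25_400

print(read_n50(toy_reads))
print(mean_depth(toy_reads, toy_genome_size))
```

In practice these values are usually computed from aligned reads with dedicated tools rather than from raw lengths, so figures such as 37× and 54 kbp reflect choices (alignment, filtering) that this sketch ignores.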

Version published to 10.1101/gr.279273.124
Oct 2, 2024
Version published to 10.1101/2024.03.05.24303792 on medRxiv
Mar 7, 2024

Related articles

A Benchmarking Framework to Catalyze Individual Human Genome Projects

This article has 3 authors:
1. Manjushri Kalpande
2. Apoorva Ganesh
3. Subhashini Srinivasan
This article has no evaluations
Latest version Dec 17, 2025

Benchmarking RNA-seq Tools for Real-World Diagnostic Applications

This article has 15 authors:
1. Sarah Silverstein
2. Kaushik Ganapathy
3. Sandra Donkervoort
4. Veronique Bolduc
5. Ying Hu
6. Justin Moy
7. Prech Uapinyoying
8. Svetlana Gorokhova
9. Vijay Ganesh
10. Ben Weisburd
11. Rotem OrBach
12. A. Reghan Foley
13. Pejman Mohammadi
14. David Adams
15. Carsten Bonnemann
This article has no evaluations
Latest version Jan 29, 2026

Capturing clinically actionable copy number alterations in Wilms tumor using nanopore sequencing

This article has 9 authors:
1. Larissa V. Furtado
2. Carolyn Jablonowski
3. Pandurang Kolekar
4. Teresa Santiago
5. Christopher L. Morton
6. Allison Woolard
7. Andrew M. Davidoff
8. Xiaotu Ma
9. Andrew J. Murphy
This article has no evaluations
Latest version Jan 25, 2026
