SAVANA: reliable analysis of somatic structural variants and copy number aberrations in clinical samples using long-read sequencing

Hillary Elrick
Carolin M Sauer
Jose Espejo Valle-Inclan
Katherine Trevers
Melanie Tanguy
Sonia Zumalave
Solange De Noon
Francesc Muyas
Rita Cascao
Angela Afonso
Fernanda Amary
Roberto Tirabosco
Adam Giess
Timothy Freeman
Alona Sosinsky
Katherine Piculell
David T Miller
Claudia C Faria
Greg Elgar
Adrienne M Flanagan
Isidro Cortes-Ciriano


Abstract

Accurate detection of somatic structural variants (SVs) and copy number aberrations (SCNAs) is critical to inform the diagnosis and treatment of human cancers. Here, we describe SAVANA, a computationally efficient algorithm designed for the joint analysis of somatic SVs, SCNAs, tumour purity and ploidy using long-read sequencing data. SAVANA relies on machine learning to distinguish true somatic SVs from artefacts and provide prediction errors for individual SVs. Using high-depth Illumina and nanopore whole-genome sequencing data for 99 human tumours and matched normal samples, we establish best practices for benchmarking SV detection algorithms across the entire genome in an unbiased and data-driven manner using simulated and sequencing replicates of tumour and matched normal samples. SAVANA shows significantly higher sensitivity, and 9- and 59-times higher specificity than the second and third-best performing algorithms, yielding orders of magnitude fewer false positives in comparison to existing long-read sequencing tools across various clonality levels, genomic regions, SV types and SV sizes. In addition, SAVANA harnesses long-range phasing information to detect somatic SVs and SCNAs at single-haplotype resolution. SVs reported by SAVANA are highly consistent with those detected using short-read sequencing, including complex events causing oncogene amplification and tumour suppressor gene inactivation. In summary, SAVANA enables the application of long-read sequencing to detect SVs and SCNAs reliably in clinical samples.
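As a rough illustration of the kind of comparison that SV benchmarking involves, the sketch below matches called breakpoints to a reference set within a position tolerance and reports recall (sensitivity) and precision. It is not SAVANA's method or the benchmarking procedure from the preprint. The `Breakpoint` class, the `benchmark` function, the greedy one-to-one matching and the 100 bp tolerance are all assumptions made for this example. Precision is reported rather than specificity because true negatives are not enumerated in this toy setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    """Hypothetical minimal SV breakpoint record (illustration only)."""
    chrom: str
    pos: int
    sv_type: str


def matches(a: Breakpoint, b: Breakpoint, tolerance: int) -> bool:
    # Same chromosome and SV type, positions within `tolerance` bp.
    return (
        a.chrom == b.chrom
        and a.sv_type == b.sv_type
        and abs(a.pos - b.pos) <= tolerance
    )


def benchmark(calls, truth, tolerance=100):
    """Greedy one-to-one matching of calls against a reference set.

    The tolerance value is an assumption, not taken from the preprint.
    """
    matched_truth = set()  # indices of reference SVs already matched
    true_pos = 0
    for call in calls:
        for i, ref in enumerate(truth):
            if i not in matched_truth and matches(call, ref, tolerance):
                matched_truth.add(i)
                true_pos += 1
                break
    false_pos = len(calls) - true_pos
    false_neg = len(truth) - true_pos
    recall = true_pos / len(truth) if truth else 0.0
    precision = true_pos / len(calls) if calls else 0.0
    return {
        "tp": true_pos,
        "fp": false_pos,
        "fn": false_neg,
        "recall": recall,
        "precision": precision,
    }


# Toy data, invented for this example.
truth = [Breakpoint("chr1", 1000, "DEL"), Breakpoint("chr2", 5000, "INV")]
calls = [
    Breakpoint("chr1", 1040, "DEL"),  # within tolerance of the first reference SV
    Breakpoint("chr3", 200, "DUP"),   # no counterpart in the reference set
    Breakpoint("chr2", 5300, "INV"),  # right type, but 300 bp away
]
result = benchmark(calls, truth)
print(result)
```

Greedy matching keeps the sketch short. A real comparison would likely need to handle two-sided breakpoints, SV size and ambiguous matches more carefully.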

Version published to 10.1101/2024.07.25.604944 on bioRxiv
Jul 25, 2024

Related articles

Capturing clinically actionable copy number alterations in Wilms tumor using nanopore sequencing

This article has 9 authors:
1. Larissa V. Furtado
2. Carolyn Jablonowski
3. Pandurang Kolekar
4. Teresa Santiago
5. Christopher L. Morton
6. Allison Woolard
7. Andrew M. Davidoff
8. Xiaotu Ma
9. Andrew J. Murphy
This article has no evaluations. Latest version: Jan 25, 2026

Integrative benchmarking and automation of clonal reconstruction of somatic mutations in single-sample tumor genome analysis

This article has 3 authors:
1. Marina Masliakova
2. Steve Lefever
3. Jo Vandesompele
This article has no evaluations. Latest version: Jan 21, 2026

Benchmarking RNA-seq Tools for Real-World Diagnostic Applications

This article has 15 authors:
1. Sarah Silverstein
2. Kaushik Ganapathy
3. Sandra Donkervoort
4. Veronique Bolduc
5. Ying Hu
6. Justin Moy
7. Prech Uapinyoying
8. Svetlana Gorokhova
9. Vijay Ganesh
10. Ben Weisburd
11. Rotem OrBach
12. A. Reghan Foley
13. Pejman Mohammadi
14. David Adams
15. Carsten Bonnemann
This article has no evaluations. Latest version: Jan 29, 2026
