Community-Driven Copy Number Variant Discovery at Scale: Results from a Rare Disease Genomics Hackathon

Ming Yin Lun
Jennifer E. Posey
Jesse D. Bengtsson
Haowei Du
Rituparna Sinha Roy
Lei Yang
Sebastian Ochoa
Bo Yuan
Maddie Gillentine
Anna Lindstrand
Claudia M. B. Carvalho

Abstract

Purpose

Copy number variants (CNVs) are a major contributor to rare genetic diseases, but their detection and interpretation from short-read genome sequencing (srGS) data remain challenging, especially at scale. Large amounts of existing srGS data remain under-analyzed for clinically relevant CNVs.

Methods

During a collaborative Hackathon, we developed and applied scalable CNV analysis workflows to srGS data from three unsolved, exome-negative, rare disease cohorts: Primary Immunodeficiency (N = 39), Turkish developmental disorders (N = 31), and data from the Genomics Research to Elucidate the Genetics of Rare diseases (GREGoR) (N = 1437). We employed Parliament2 for structural variant (SV) calling, Mosdepth and SLMSuite for read-depth–based quality control and CNV detection, and R Shiny-based visualization tools. We also constructed an SV/CNV variant database with population frequency and pathogenicity annotations, applied DBSCAN clustering for internal allele frequency estimation, and used a 3-way annotation strategy to aid interpretation.
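The DBSCAN step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the features, distance metric, or parameters, so the choices below are assumptions. Here calls on the same chromosome and of the same type are clustered on their (start, end) coordinates with a Chebyshev distance. The function names `dbscan` and `internal_frequencies`, the 500 bp `eps`, and `min_samples=2` are hypothetical. The sketch reports the fraction of samples carrying a clustered call, which is only a stand-in for allele frequency because genotypes are ignored.

```python
from collections import defaultdict


def dbscan(points, eps, min_samples):
    """Plain DBSCAN over (start, end) pairs using Chebyshev distance.

    Returns one integer label per point; -1 marks noise.
    """
    def neighbors(i):
        si, ei = points[i]
        return [j for j, (s, e) in enumerate(points)
                if max(abs(s - si), abs(e - ei)) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_samples:  # core point: keep growing the cluster
                queue.extend(nj)
    return labels


def internal_frequencies(calls, n_samples, eps=500, min_samples=2):
    """Per-call carrier fraction within the cohort.

    calls: list of (sample, chrom, start, end, svtype) tuples.
    eps and min_samples are illustrative defaults, not published values.
    """
    groups = defaultdict(list)
    for idx, (_sample, chrom, _start, _end, svtype) in enumerate(calls):
        groups[(chrom, svtype)].append(idx)

    freqs = [None] * len(calls)
    for idxs in groups.values():
        points = [(calls[i][2], calls[i][3]) for i in idxs]
        labels = dbscan(points, eps, min_samples)
        carriers = defaultdict(set)
        for i, lab in zip(idxs, labels):
            if lab != -1:
                carriers[lab].add(calls[i][0])
        for i, lab in zip(idxs, labels):
            # Unclustered (noise) calls are treated as private to one sample.
            n_carriers = 1 if lab == -1 else len(carriers[lab])
            freqs[i] = n_carriers / n_samples
    return freqs


# Toy cohort of 4 samples (made-up coordinates, for illustration only).
calls = [
    ("S1", "chr1", 10000, 20000, "DEL"),
    ("S2", "chr1", 10100, 20050, "DEL"),
    ("S3", "chr1", 9950, 19900, "DEL"),
    ("S4", "chr1", 500000, 600000, "DEL"),
    ("S1", "chr1", 10050, 20020, "DUP"),
]
freqs = internal_frequencies(calls, n_samples=4)
for call, freq in zip(calls, freqs):
    print(call, freq)
```

In this toy cohort the three overlapping deletions fall into one cluster, while the distant deletion and the lone duplication stay private. A production workflow would more likely use an existing implementation such as scikit-learn's `DBSCAN`, and might cluster on reciprocal overlap or size-scaled breakpoint distance rather than a fixed base-pair tolerance.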

Results

Our pipelines identified high-confidence CNVs and streamlined interpretation across cohorts. Within 2 days, the Hackathon yielded 39 candidate pathogenic SVs. The tools and workflows enabled rapid filtering, prioritization, and visualization of clinically relevant variants.

Conclusion

This community-driven effort demonstrates the feasibility and utility of scalable CNV analysis for accelerating diagnosis and discovery in rare disease cohorts using srGS data.

Version published to 10.1101/2025.08.08.25333317 on medRxiv
Aug 12, 2025

Related articles

Enhancing variant detection in complex genomes: leveraging linked reads for robust SNP, Indel, and structural variant analysis

This article has 7 authors:
1. Can Luo
2. Yichen Liu
3. Han Liu
4. Zhenmiao Zhang
5. Lu Zhang
6. Brock Peters
7. Xin Maizie Zhou
This article has no evaluations. Latest version: Jan 12, 2026

Benchmarking RNA-seq Tools for Real-World Diagnostic Applications

This article has 15 authors:
1. Sarah Silverstein
2. Kaushik Ganapathy
3. Sandra Donkervoort
4. Veronique Bolduc
5. Ying Hu
6. Justin Moy
7. Prech Uapinyoying
8. Svetlana Gorokhova
9. Vijay Ganesh
10. Ben Weisburd
11. Rotem OrBach
12. A. Reghan Foley
13. Pejman Mohammadi
14. David Adams
15. Carsten Bonnemann
This article has no evaluations. Latest version: Jan 29, 2026

Capturing clinically actionable copy number alterations in Wilms tumor using nanopore sequencing

This article has 9 authors:
1. Larissa V. Furtado
2. Carolyn Jablonowski
3. Pandurang Kolekar
4. Teresa Santiago
5. Christopher L. Morton
6. Allison Woolard
7. Andrew M. Davidoff
8. Xiaotu Ma
9. Andrew J. Murphy
This article has no evaluations. Latest version: Jan 25, 2026
