gcSV: a unified framework for comprehensive structural variant detection

Abstract

The characterization of structural variants (SVs) is fundamental to genomic studies, and advanced computational approaches are in demand to fully exploit ubiquitous high-throughput sequencing data. Herein, we propose gcSV, a read-length-agnostic, alignment-based approach that handles the issues of genome repeats, SV breakpoints and read alignments/assemblies for comprehensive, cost-effective and versatile SV calling. For long reads, its yield is 20-38% higher than that of state-of-the-art tools on the HG002 benchmark. For hybrid sequencing, it provides a cost-effective solution (2-4x long reads plus 30-60x short reads) that achieves an even higher yield than state-of-the-art tools using 30x long reads. For short reads, gcSV achieves over 8% higher precision without any loss of sensitivity. Furthermore, gcSV confidently identifies over 93,000 novel SVs compared with the official callset of the 1000 Genomes Project Phase 4. The results suggest that gcSV is a promising tool for making valuable SV discoveries in many cutting-edge studies.
