gcSV: a unified framework for comprehensive structural variant detection
Abstract
The characterization of structural variants (SVs) is fundamental to genomic studies, and advanced computational approaches are in demand to exploit the full potential of ubiquitous high-throughput sequencing data. Herein, we propose gcSV, a read-length-agnostic, alignment-based approach that handles the issues of genome repeats, SV breakpoints and read alignments/assemblies for comprehensive, cost-effective and versatile SV calling. For long reads, its yield is 20-38% higher than that of state-of-the-art tools on the HG002 benchmark. For hybrid sequencing, it provides a cost-effective solution (2-4x long reads plus 30-60x short reads) that achieves an even higher yield than state-of-the-art tools using 30x long reads. For short reads, gcSV also achieves over 8% higher precision without any loss of sensitivity. Furthermore, gcSV confidently identifies over 93,000 novel SVs compared with the official callset of the 1000 Genomes Project Phase 4. The results suggest that gcSV is a promising tool for making valuable SV discoveries in many cutting-edge studies.