SeQuiLa-cov: A fast and scalable library for depth of coverage calculations

Abstract

Background

Depth of coverage calculation is an important and computationally intensive preprocessing step in a variety of next-generation sequencing pipelines, including the analysis of RNA-sequencing data, detection of copy number variants, and quality control procedures.
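
To make the term concrete, the following minimal sketch computes per-base depth for a single contig and collapses it into blocks of equal depth. It uses a simple event-based sweep chosen for illustration only; it is not claimed to be the SeQuiLa-cov implementation. The function names (`per_base_depth`, `to_blocks`) and the input format (0-based, half-open `(start, end)` read intervals that lie within the contig) are assumptions.

```python
from itertools import groupby


def per_base_depth(reads, contig_length):
    """Return the number of reads covering each position of one contig.

    Assumption: `reads` holds 0-based, half-open (start, end) intervals
    with 0 <= start < end <= contig_length.
    """
    # +1 where a read starts, -1 just past where it ends.
    events = [0] * (contig_length + 1)
    for start, end in reads:
        events[start] += 1
        events[end] -= 1

    # A running sum over the events gives the depth at every position.
    depth = []
    running = 0
    for delta in events[:contig_length]:
        running += delta
        depth.append(running)
    return depth


def to_blocks(depth):
    """Collapse per-base depth into (start, end, depth) blocks of equal depth."""
    blocks = []
    pos = 0
    for value, run in groupby(depth):
        length = len(list(run))
        blocks.append((pos, pos + length, value))
        pos += length
    return blocks


reads = [(0, 5), (2, 7), (4, 6)]
depth = per_base_depth(reads, contig_length=8)
blocks = to_blocks(depth)
```

The sweep touches each read once and each position once, which is why this style of calculation lends itself to being split by contig or region and run in parallel.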

Results

Building upon big data technologies, we have developed SeQuiLa-cov, an extension to the recently released SeQuiLa platform, which provides efficient depth of coverage calculations, reaching >100× speedup over state-of-the-art tools. The performance and scalability of our solution allow for exome and genome-wide calculations running locally or on a cluster, while the complexity of distributed computing is hidden behind a Structured Query Language (SQL) application programming interface.

Conclusions

SeQuiLa-cov provides a significant performance gain in depth of coverage calculations, streamlining widely used bioinformatic processing pipelines.

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giz094

    Marek Wiewiórka, Agnieszka Szmurło, Tomasz Gambin
    Institute of Computer Science, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz094 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101847
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101848
    Reviewer 3: http://dx.doi.org/10.5524/REVIEW.101849