GenPipes: an open-source framework for distributed and scalable genomic analyses

Abstract

Background

With the decreasing cost of sequencing and the rapid developments in genomics technologies and protocols, the need for validated bioinformatics software that enables efficient large-scale data processing is growing.

Findings

Here we present GenPipes, a flexible Python-based framework that facilitates the development and deployment of multi-step workflows optimized for high-performance computing clusters and the cloud. GenPipes already implements 12 validated and scalable pipelines for various genomics applications, including RNA sequencing, chromatin immunoprecipitation sequencing, DNA sequencing, methylation sequencing, Hi-C, capture Hi-C, metagenomics, and Pacific Biosciences long-read assembly. The software is available under a GPLv3 open-source license and is continuously updated to follow recent advances in genomics and bioinformatics. The framework has already been configured on several servers, and a Docker image is also available to facilitate additional installations.

Conclusions

GenPipes offers genomics researchers a simple method to analyze different types of data, customizable to their needs and resources, as well as the flexibility to create their own workflows.

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giz037

    Mathieu Bourgey (1, 2), Rola Dali (1, 2), Robert Eveleigh (1, 2), …

    1. Canadian Centre for Computational Genomics, Montréal, QC, Canada.
    2. McGill University and Genome Québec Innovation Center, Montréal, QC, Canada.

    For correspondence: guil.bourque@mcgill.ca, mathieu.bourgey@mcgill.ca