Refgenie: a reference genome resource manager

Abstract

Background

Reference genome assemblies are essential for high-throughput sequencing analysis projects. Typically, genome assemblies are stored on disk alongside related resources; e.g., many sequence aligners require the assembly to be indexed. The resulting indexes are broadly applicable for downstream analysis, so it makes sense to share them. However, there is no simple tool to do this.

Results

Here, we introduce refgenie, a reference genome assembly asset manager. Refgenie makes it easier to organize, retrieve, and share genome analysis resources. In addition to genome indexes, refgenie can manage any files related to reference genomes, including sequences and annotation files. Refgenie includes a command line interface and a server application that provides a RESTful API, so it is useful for both tool development and analysis.

Conclusions

Refgenie streamlines sharing genome analysis resources among groups and across computing environments. Refgenie is available at https://refgenie.databio.org.
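The core idea in the abstract, organizing genome-related files so they can be retrieved by genome and asset name, can be illustrated with a minimal sketch. This is not refgenie's implementation or API: the `AssetRegistry` class, its methods, and the example paths are hypothetical and exist only to show the lookup pattern that a genome asset manager provides.

```python
from pathlib import Path


class AssetRegistry:
    """Hypothetical illustration only; not refgenie's implementation or API."""

    def __init__(self, root):
        # Root folder under which all genome assets are assumed to live.
        self.root = Path(root)
        # Maps (genome, asset) pairs to paths relative to the root.
        self._assets = {}

    def add(self, genome, asset, relative_path):
        """Register an asset (e.g. an aligner index) for a genome."""
        self._assets[(genome, asset)] = Path(relative_path)

    def seek(self, genome, asset):
        """Return the full path to a registered asset."""
        try:
            return self.root / self._assets[(genome, asset)]
        except KeyError:
            raise KeyError(
                f"no asset '{asset}' registered for genome '{genome}'"
            ) from None

    def list_assets(self, genome):
        """Return the sorted asset names registered for one genome."""
        return sorted(a for g, a in self._assets if g == genome)


# Example usage with made-up paths.
registry = AssetRegistry("/data/genomes")
registry.add("hg38", "fasta", "hg38/fasta/hg38.fa")
registry.add("hg38", "bowtie2_index", "hg38/bowtie2_index")

print(registry.seek("hg38", "fasta"))
print(registry.list_assets("hg38"))
```

With a lookup like this, analysis scripts refer to an asset by genome and name instead of hard-coding file paths, which is what makes the same script portable across computing environments.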

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giz149

    Michal Stolarczyk (1), Vincent P. Reuter (1), Neal E. Magee (5), Nathan C. Sheffield (1, 2, 3, 4)

    1 Center for Public Health Genomics, University of Virginia
    2 Department of Public Health Sciences, University of Virginia
    3 Department of Biomedical Engineering, University of Virginia
    4 Department of Biochemistry and Molecular Genetics, University of Virginia
    5 Research Computing, University of Virginia

    For correspondence: nsheffield@virginia.edu

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz149 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102075
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102076
    Reviewer 3: http://dx.doi.org/10.5524/REVIEW.102077
    Reviewer 4: http://dx.doi.org/10.5524/REVIEW.102078