dadasnake, a Snakemake implementation of DADA2 to process amplicon sequencing data for microbial ecology

Abstract

Background

Amplicon sequencing of phylogenetic marker genes, e.g. 16S, 18S or ITS rRNA sequences, is still the most commonly used method to determine the composition of microbial communities. Microbial ecologists often have expert knowledge of their biological question and of data analysis in general, and most research institutes have the computational infrastructure to run the bioinformatics command-line tools and workflows for amplicon sequencing analysis. However, the bioinformatics skills these tools require often limit the efficient and up-to-date use of computational resources.

Results

dadasnake wraps pre-processing of sequencing reads, delineation of exact sequence variants using the favorably benchmarked, widely used DADA2 algorithm, taxonomic classification, post-processing of the resultant tables, and hand-off in standard formats into a user-friendly, one-command Snakemake pipeline. The suitability of the provided default configurations is demonstrated using mock-community data from bacteria and archaea, as well as fungi.

Conclusions

By using Snakemake, dadasnake makes efficient use of high-performance computing infrastructures. Easy user configuration keeps all steps flexible, including the processing of data from multiple sequencing platforms. dadasnake can be installed easily via conda environments. dadasnake is available at https://github.com/a-h-b/dadasnake .

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giaa135

    Christina Weißbecker (1), Beatrix Schnabel (1), Anna Heintz-Buschart (2, 1)

    1 Helmholtz Centre for Environmental Research GmbH - UFZ, Department of Soil Ecology
    2 German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Bioinformatics Unit

    For correspondence: anna.heintz-buschart@ufz.de

    A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giaa135 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102503
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102504