TATAT: a containerized software for generating annotated coding transcriptomes from raw RNA-seq data
Abstract
Motivation
Many transcriptome creation workflows are not standardized, are difficult to install or share, are prone to breaking as dependencies update or cease to be maintained, and are resource intensive. Due to a lack of authoritative literature, many also overlook potentially important steps, such as thinning contig over-assembly or identifying transcript consensus across samples, which reduce resource demands during annotation and increase the accuracy of the final transcripts.
Results
We developed TATAT, a modular, Dockerized software that contains all the tools necessary to generate an annotated coding transcriptome from raw RNA-seq data. The tools remain in a static state and can be coordinated with the bash and Python scripts provided in the container, making TATAT a standardized, reproducible workflow that can easily be shared and installed. We preferentially incorporate tools that are not only accurate but also fast and require less RAM, and subsequently show that TATAT can generate a comprehensive transcriptome for a non-model organism, the Egyptian rousette bat (Rousettus aegyptiacus), in ∼8 hours in a high-performance computing (HPC) environment.
Availability and implementation
The TATAT code, instructions, and tutorial are available at https://github.com/viralemergence/tatat.