TATAT: a containerized software for generating annotated coding transcriptomes from raw RNA-seq data

Abstract

Motivation

Many transcriptome creation workflows are not standardized, are difficult to install or share, are prone to breaking as dependencies update or cease to be maintained, and are resource intensive. Due to a lack of authoritative literature, many also overlook potentially important steps, such as thinning over-assembled contigs or identifying transcript consensus across samples, which reduce resource demands during annotation and increase the accuracy of the final transcripts.

Results

We developed TATAT, a modular, Dockerized software package that contains all the tools necessary to generate an annotated coding transcriptome from raw RNA-seq data. The tools remain in a static state and can be coordinated with the bash and Python scripts provided with them, making TATAT a standardized, reproducible workflow that can easily be shared and installed. We preferentially incorporate tools that are not only accurate but also fast and require less RAM, and we show that TATAT can generate a comprehensive transcriptome for a non-model organism, the Egyptian rousette bat (Rousettus aegyptiacus), in ∼8 hours in a high-performance computing (HPC) environment.

Availability and implementation

The TATAT code, instructions, and tutorial are available at https://github.com/viralemergence/tatat.
