PIPETS: a statistically informed, gene-annotation agnostic analysis method to study bacterial termination using 3′-end sequencing

Abstract

Background

Over the last decade, the drop in short-read sequencing costs has allowed sequencing-based experimental techniques that address specific biological questions to proliferate, often outpacing the development of standardized or effective analysis approaches for the data they generate. The amount of bacterial 3′-end sequencing data is growing, yet there is currently no commonly accepted analysis methodology for this data type. Most data analysis approaches are somewhat ad hoc and, despite the presence of substantial signal within annotated genes, focus on genomic regions outside the annotated genes (e.g. 3′ or 5′ UTRs). Furthermore, the lack of consistent, systematic analysis approaches, together with the absence of genome-wide ground truth data, makes it impossible to compare conclusions generated by different labs using different organisms.

Results

We present PIPETS (Poisson Identification of PEaks from Term-Seq data), an R package available on Bioconductor that provides a novel analysis method for 3′-end sequencing data. PIPETS is a statistically informed, gene-annotation agnostic methodology. Across two different datasets from two different organisms, PIPETS identified significant 3′-end termination signal across a wider range of annotated genomic contexts than existing analysis approaches, suggesting that existing approaches may miss biologically relevant signal. Furthermore, assessment of the previously called 3′-end positions not captured by PIPETS showed that they uniformly had very low coverage.

Conclusions

PIPETS provides a broadly applicable platform to explore and analyze 3′-end sequencing datasets from different organisms. It requires only the 3′-end sequencing data and is accessible to non-expert users.
