Large-scale reconstructions of Drosophila transcriptome identify ten thousands of new transcripts and transcription readthrough events

Abstract

Comprehensive transcriptome annotation is critical for understanding gene regulation, developmental plasticity, and evolutionary constraint. Here, we reconstruct a high-resolution transcriptome atlas of Drosophila melanogaster by integrating 33,186 high-quality RNA-seq libraries spanning multiple developmental stages, tissues, and environmental conditions. The final assembly contains 398,168 transcripts, including 136,828 protein-coding mRNAs and 125,131 long non-coding RNAs (lncRNAs), of which 4,515 protein-coding transcripts and 28,528 lncRNAs were previously unannotated. Notably, we identify 49 transcripts that are highly expressed in more than 90% of samples, which we term super-housekeeping transcripts. Surprisingly, 22 of them were annotated as uncharacterized transcripts. We then carried out structure-based annotation by comparing their predicted structures against the PDB100 protein database. This comparison shows that most of these super-housekeeping transcripts share highly conserved structures with proteins associated with ribosome production, mitochondrial respiration, and nucleic acid processing, and that 40 of them serve as hub genes across multiple co-expression network modules. In addition, we identify 890,750 transcription readthrough events in Drosophila, which are especially enriched during pupal metamorphosis and tend to be induced by irradiation, starvation, and high-fat diet. Together, these findings indicate that the present understanding of Drosophila transcripts is still only the tip of the iceberg. The functional complexity of Drosophila transcription warrants further attention.
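
The super-housekeeping criterion described above (high expression in more than 90% of samples) can be illustrated with a minimal Python sketch. The abstract does not state the expression cutoff or the unit used, so the `min_expr` value, the function name `super_housekeeping`, and the toy data below are all assumptions made purely for illustration, not the authors' pipeline.

```python
from typing import Dict, List


def super_housekeeping(
    expr: Dict[str, List[float]],
    min_expr: float = 10.0,      # assumed cutoff; the abstract gives no value
    min_fraction: float = 0.9,   # "more than 90% of samples" from the abstract
) -> List[str]:
    """Return transcript IDs expressed above min_expr in more than
    min_fraction of samples."""
    selected = []
    for transcript_id, values in expr.items():
        if not values:
            continue  # skip transcripts with no measurements
        n_high = sum(1 for v in values if v > min_expr)
        if n_high / len(values) > min_fraction:
            selected.append(transcript_id)
    return selected


# Hypothetical toy expression matrix: transcript ID -> per-sample expression.
toy = {
    "tx_A": [50.0] * 10,               # high in 10 of 10 samples
    "tx_B": [50.0] * 8 + [0.0] * 2,    # high in 8 of 10 samples
    "tx_C": [50.0] * 5 + [1.0] * 5,    # high in 5 of 10 samples
}

print(super_housekeeping(toy))
```

With this toy input only `tx_A` passes the filter. On real data the cutoff and the normalization (for example TPM or FPKM) would need to match whatever the study actually used.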
