More than 100 dual coding regions have evidence for selection constraints in both reading frames


Abstract

Alternative splicing can generate multiple differently spliced transcripts from a single pre-mRNA. A striking number of genes have alternative splice events that can change the downstream reading frame, leading to exons that code from distinct reading frames. In fact, more than a third of the coding genes in the human gene set are annotated with dual coding exons derived from alternative splicing events.
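As a minimal illustration of how a splice event can shift the downstream reading frame, the sketch below translates two toy isoforms that share a final exon. The exon sequences and all function and variable names are made up for illustration and are not taken from the article; the only real biology assumed is the standard genetic code. Skipping a 5-nucleotide exon (a length not divisible by 3) means the shared exon is read in a different frame in each isoform, so it is dual coding.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate every complete codon of a coding sequence, starting at position 0."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def first_codon_offset(upstream_length: int) -> int:
    """Offset inside an exon of its first complete codon, given the coding length upstream."""
    return (3 - upstream_length % 3) % 3


# Hypothetical toy exons (not real gene sequences).
exon1 = "ATGGCC"            # 6 nt, keeps the frame
cassette_exon = "GAAGT"     # 5 nt, not a multiple of 3
shared_exon = "GCTGAAACCTGGCAT"

isoform_a = exon1 + cassette_exon + shared_exon  # cassette exon included
isoform_b = exon1 + shared_exon                  # cassette exon skipped

offset_a = first_codon_offset(len(exon1) + len(cassette_exon))
offset_b = first_codon_offset(len(exon1))

print("Isoform A protein:", translate(isoform_a))
print("Isoform B protein:", translate(isoform_b))
print("Shared exon, frame used in A:", translate(shared_exon[offset_a:]))
print("Shared exon, frame used in B:", translate(shared_exon[offset_b:]))
```

In this toy case the two isoforms share only the residues encoded by the first exon and diverge completely afterwards, which mirrors the kind of distinct C-terminal sequences the abstract describes.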

Here we analysed a set of 537 dual coding regions that have evidence to support their functional importance. These dual coding regions produce protein isoforms with completely different C-termini and have reading frames that are supported by either peptide or conservation evidence. More than a quarter of the alternative reading frames are preserved across all mammals, and many can be traced back to the earliest jawed vertebrates. Most of these ancient dual coding regions appear to be under selective constraints. We find support for purifying selection on both frames in 105 pairs of transcripts, and two genes, CCSER2 and SH2B1, have triple coding regions that are under clear selection pressure in all three frames.

We found evidence to suggest that many ancient dual coding regions may have played important roles in the evolution of the vertebrate central nervous system. Most ancient dual coding regions with evidence for protein level tissue specificity were brain specific and we showed that genes with ancient dual coding regions are highly enriched in brain tissues. Most remarkably, we found that more than 80% of the genes with these ancient dual coding regions are implicated in neuron development, synapses and neural cell projections.
