Multiple human enhancer RNAs contain long translated open reading frames
Abstract
Enhancer RNAs (eRNAs) are transcribed by RNA polymerase II during enhancer activation but are typically degraded rapidly in the nucleus. During states of reduced RNA surveillance, however, eRNAs and similar “noncoding” RNAs (e.g., upstream antisense RNAs) are stabilized, and some are exported to the cytoplasm and can even be found on polysomes. Here, we report the unexpected finding that ∼12% of human intergenic eRNAs contain long open reading frames (>300 nt), many of which can be actively translated, as determined by ribosome profiling, and produce proteins that accumulate in cells, as shown by mass spectrometry (MS) data. Focusing on the largest of the encoded proteins, which we designate eORFs and which can be up to ∼45 kDa, we found, remarkably, that most are highly basic, with pIs >11.5. This unusual chemistry reflects a striking overabundance of arginine residues and occurs despite a relative paucity of lysines. Exogenous expression of the 10 largest eORFs revealed that they accumulate stably in cells as full-length proteins, and that most localize to the nucleus and associate with chromatin. Identification of interacting proteins by MS suggested possible roles for these proteins in several nuclear processes. The eORFs studied are well conserved among primates but largely absent from other mammals. Notably, several contain human-specific C-terminal extensions and display properties suggestive of de novo gene birth. In summary, we have discovered that a fraction of human eRNAs can function as mRNAs, revealing a new and unexpected role for these transcripts.
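To make the ">300 nt" criterion concrete, the following is a minimal illustrative sketch of scanning a transcript for long open reading frames. It is not the authors' pipeline. It assumes that an ORF runs from an ATG to the first in-frame stop codon, that only the three forward frames are searched, and that length is counted from the start codon through the stop codon. The abstract specifies none of these details, and the function name `find_long_orfs` is hypothetical.

```python
# Illustrative sketch only; the ORF definition below is an assumption,
# not the method used in the preprint.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def find_long_orfs(seq, min_len_nt=300):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    An ORF here is the first ATG in a frame through the next in-frame
    stop codon (inclusive). Coordinates are 0-based, end-exclusive.
    Only ORFs strictly longer than min_len_nt are reported.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start > min_len_nt:
                    orfs.append((start, end, frame))
                start = None  # resume searching for the next ATG
    return orfs


if __name__ == "__main__":
    # Toy transcript: ATG + 100 alanine codons + stop = 306 nt
    toy = "ATG" + "GCC" * 100 + "TAA"
    print(find_long_orfs(toy))
```

A real analysis would also need to decide how to treat the reverse strand, non-ATG start codons, and ORFs that run off the end of the annotated transcript. Those choices are left open here.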