Discovery of evolutionarily extended cis-regulatory overlapping genes expanding the protein universe from animals to humans

Abstract

A recent proteogenomics-driven approach has uncovered actively translated alternative open reading frames (altORFs) that complement reference protein-coding sequences (refCDSs). Here, we further explored proteogenomic data to identify hidden proteomes, specifically short ORF-encoded polypeptides (SEPs) and longer altORF-encoded proteins (LEPs) previously overlooked in the human genome. We discovered a novel class of proteomes originating from SEP/LEP-coding upstream overlapping altORFs (oORFs) associated with refCDSs. These oORFs cis-regulate refCDS translation and have undergone evolutionary selection favoring C-terminal extension for enhanced cis-regulation. These translatable oORFs occur in signal effectors of the Hippo-YAP/TAZ, p53, Wnt, and TGF-β crosstalk pathways and frequently arise from intragenic frameshift polymorphisms closely linked to human diseases. The intragenic frameshift mutations divide refCDSs and capture their N-terminal regions to form the new oORFs. Consequently, we termed these entities upstream region-usurping repurposed proteins (USURPs). These findings offer new insights into the birth and evolution of proteomes, broadening our understanding of the protein universe influenced by genome dynamics.
