Discovery of evolutionarily extended cis-regulatory overlapping genes expanding the protein universe from animals to humans
Abstract
A recent proteogenomics-driven approach has uncovered actively translated alternative open reading frames (altORFs) that complement reference protein-coding sequences (refCDSs). Here, we further explored proteogenomic data to identify hidden proteomes, specifically short ORF-encoded polypeptides (SEPs) and longer altORF-encoded proteins (LEPs), previously overlooked in the human genome. We discovered a novel class of proteomes originating from SEP/LEP-coding upstream overlapping altORFs (oORFs) associated with refCDSs. These oORFs cis-regulate refCDS translation and have undergone evolutionary selection favoring C-terminal extension for enhanced cis-regulation. Translatable oORFs occur in signal effectors of the Hippo-YAP/TAZ, p53, Wnt, and TGF-β crosstalk pathways and frequently arise from intragenic frameshift polymorphisms closely linked to human diseases. The intragenic frameshift mutations divide refCDSs and capture the N-terminal regions to form the new oORFs. Consequently, we termed these entities upstream region-usurping repurposed proteins (USURPs). These findings offer new insights into the birth and evolution of proteomes, broadening our understanding of the protein universe influenced by genome dynamics.
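To make the notion of an upstream overlapping ORF concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the common working definition of an oORF: an ATG-initiated ORF that starts 5' of the refCDS start codon, reads in a different frame, and terminates only after crossing into the refCDS. The function name, the tuple layout, and the toy transcript are hypothetical and chosen purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_upstream_overlapping_orfs(mrna, cds_start):
    """Scan a transcript for candidate oORFs (illustrative definition only).

    mrna      -- transcript sequence as an uppercase DNA string
    cds_start -- 0-based index of the first base of the refCDS start codon

    Returns a list of (start, end, frame_offset) tuples, where `end` is the
    index just past the stop codon and `frame_offset` is 1 or 2.
    """
    hits = []
    for start in range(cds_start):
        if mrna[start:start + 3] != "ATG":
            continue
        frame_offset = (cds_start - start) % 3
        if frame_offset == 0:
            # In frame with the refCDS: an N-terminal extension, not an oORF.
            continue
        # Walk codon by codon to the first in-frame stop codon.
        end = None
        for pos in range(start, len(mrna) - 2, 3):
            if mrna[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                break
        if end is None or end <= cds_start:
            # No stop found, or the ORF ends before the refCDS (a plain uORF).
            continue
        hits.append((start, end, frame_offset))
    return hits


# Toy transcript (assumption): refCDS ATG GCT CTG AAA TAA begins at index 7;
# an out-of-frame ATG at index 2 reads through it to a TGA at index 14.
toy_mrna = "CCATGGCATGGCTCTGAAATAA"
print(find_upstream_overlapping_orfs(toy_mrna, 7))  # -> [(2, 17, 2)]
```

Under this definition, a frameshift that removes or relocates the in-frame stop of such an ORF would lengthen its C-terminal overlap with the refCDS, which is the kind of extension the abstract describes.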