Signatures of Micropeptides Encoded by lncRNAs in Cancer Progression and Metastasis
Abstract
Long non-coding RNAs (lncRNAs) are key regulators of gene expression, chromatin remodeling, and signaling. Recent estimates suggest that the human genome contains more than 35,000 lncRNA genes, with roughly 20% predicted to encode micropeptides (MPs) of unknown function. In this study, we focused on the subset of lncRNAs with strong statistical evidence for MP-encoding potential, which accounts for approximately 8% of the unfiltered MP collection. Our analysis centered on 1,782 high-confidence lncRNA-MPs derived from 478 genes expressed across 17 cancer types from The Cancer Genome Atlas (TCGA). We show that lncRNA-MPs display distinct amino acid compositions and unique 4-mer patterns compared to the human coding proteome. Nine genes with exceptionally long transcripts each encode ≥20 MPs. Functional inference confirmed that most of the lncRNA-MPs are unstructured. Only a third of the genes display some phylogenetic conservation, and only 4 genes display canonical N-terminal signal peptides characteristic of secreted proteins. We then focused on cancer progression-associated lncRNAs that show differential expression (|z-score| > 3) across consecutive tumor stages and metastatic states (transitional lncRNAs, Tr-lncRNAs). A collection of 72 genes encoding 314 MPs (Tr-lncRNA-MPs) was detected, with 76% of the MPs being ≥30 amino acids long. Structure prediction with AlphaFold 2.0 and homology modeling tools revealed dozens of MPs with well-defined secondary structures and recognizable 3D motifs. Among the longer Tr-lncRNA-MPs (>60 amino acids), we confirmed the presence of ubiquitin-like, RNase H-related, and other conserved foldable motifs. Known cancer lncRNAs containing high-confidence MPs (XIST, UCA1, HOXA11-AS, LINC01234, and HAND-AS1) overlap with 50 pan-cancer lncRNAs associated with tumor stage or metastasis transitions.
Together, these findings demonstrate that integrating sequence motifs (e.g., signal peptides, k-mers) with structural foldability offers a multifaceted view of lncRNA-MPs in cancer. We argue that the capacity to produce MPs may reinforce the oncogenic impact dominated by the lncRNA entity. We propose that Tr-lncRNA-MPs represent a promising new class of biomarkers and therapeutic targets in oncology.
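The sequence-level comparison mentioned in the abstract (amino acid composition and 4-mer patterns) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `aa_composition` and `kmer_counts` are hypothetical, and the sketch assumes the MP sequences are available as plain one-letter amino acid strings. The same functions could be run on a reference proteome to obtain the background for comparison.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seqs):
    """Return the pooled fraction of each standard amino acid.

    Hypothetical helper: non-standard symbols (e.g. 'X', '*') are ignored.
    """
    counts = Counter(
        ch for seq in seqs for ch in seq.upper() if ch in AMINO_ACIDS
    )
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def kmer_counts(seqs, k=4):
    """Count overlapping k-mers (default k=4) across all sequences.

    Hypothetical helper: k-mers containing a non-standard symbol are skipped.
    """
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if all(ch in AMINO_ACIDS for ch in kmer):
                counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not taken from the study.
    toy_mps = ["MKLLAAGG", "MAAGGKLL"]
    print(aa_composition(toy_mps)["A"])
    print(kmer_counts(toy_mps).most_common(3))
```

Comparing the resulting frequency tables between the MP set and a background proteome (for example with a log-ratio per 4-mer) is one simple way to surface over- or under-represented patterns; the statistical test actually used in the study is not specified in the abstract.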
Key points
- 478 lncRNA genes with strong evidence for micropeptide (MP) production generated 1,782 distinct lncRNA-MPs.
- 72 transitional lncRNAs encoding 314 MPs are associated with stages of tumor progression and metastasis across 17 cancer types.
- Sequence and structural analyses reveal many MPs with reliable 3D folding potential.
- Dozens of previously overlooked MPs may serve as novel biomarkers and therapeutic targets in cancer.
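
The selection criterion for transitional lncRNAs (|z-score| > 3 between consecutive tumor stages or metastatic states) can be sketched as a simple filter. This is an illustrative sketch only: it assumes that a z-score per gene and per stage transition has already been computed, and the function name `transitional_genes`, the gene labels, and the transition labels are all hypothetical. How the z-scores themselves were derived is not described in the abstract.

```python
def transitional_genes(z_by_gene, threshold=3.0):
    """Keep genes with |z| > threshold for at least one stage transition.

    z_by_gene maps a gene name to a dict of {transition label: z-score}.
    Returns the same structure restricted to the passing transitions.
    """
    hits = {}
    for gene, transitions in z_by_gene.items():
        passing = {
            label: z for label, z in transitions.items() if abs(z) > threshold
        }
        if passing:
            hits[gene] = passing
    return hits


if __name__ == "__main__":
    # Made-up values for illustration; not data from the study.
    example = {
        "GENE_A": {"I->II": 3.5, "II->III": -0.2},
        "GENE_B": {"I->II": -4.1},
        "GENE_C": {"I->II": 3.0},  # exactly 3.0 does not pass a strict cutoff
    }
    print(transitional_genes(example))
```

The filter treats up- and down-regulation symmetrically by taking the absolute value, and uses a strict inequality to match the stated cutoff.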