RESM: Capturing sequence and structure encoding of RNAs by mapped transfer learning from ESM (evolutionary scale modeling) protein language model

Abstract

RNA sequences exhibit lower evolutionary conservation than proteins, owing to their informationally constrained four-letter alphabet compared with the 20-letter amino acid code. This reduced information content makes unsupervised learning of structural and functional evolutionary patterns from single RNA sequences more challenging. We overcame this limitation by mapping RNA sequences to pseudo-protein sequences, enabling effective transfer learning from a protein language model (protein Evolutionary Scale Model 2, protESM-2). The resulting RNA ESM (RESM) outperforms 12 existing RNA language models in zero-shot prediction, not only in sequence classification but also in RNA secondary structure and RNA-RNA interaction prediction. Further supervised fine-tuning demonstrates RESM's generalizability and superior performance over existing models across multiple downstream tasks, including mRNA ribosome loading efficiency and gene expression prediction, despite RESM being trained exclusively on noncoding RNAs. Moreover, RESM generalizes to unseen sequences beyond its 1,024-nucleotide training limit, achieving an 81.3% improvement over state-of-the-art methods in supervised secondary structure prediction for RNAs up to 4,000 nucleotides, limited only by available GPU memory, while providing a >1,000-fold speedup compared with MSA-based approaches. RESM provides a robust foundation for deciphering RNA sequence-structure-function relationships, with broad implications for RNA biology.
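
To illustrate the pseudo-protein mapping idea described above, the sketch below rewrites an RNA sequence in amino-acid letters and embeds it with a public ESM-2 checkpoint via Hugging Face transformers. The specific nucleotide-to-amino-acid letter mapping, the toy sequence, and the choice of checkpoint are placeholder assumptions for illustration; they are not RESM's actual mapping or weights, which the abstract does not specify.

```python
# Minimal sketch: embed an RNA sequence with a protein language model (ESM-2)
# by first rewriting it as a "pseudo-protein" sequence.
# NOTE: the letter mapping and checkpoint below are assumptions for
# illustration only, not RESM's actual mapping or trained weights.
import torch
from transformers import AutoTokenizer, EsmModel

# Hypothetical one-to-one mapping from the 4-letter RNA alphabet into the
# 20-letter amino-acid alphabet consumed by the ESM-2 tokenizer.
RNA_TO_AA = {"A": "A", "C": "C", "G": "G", "U": "V"}

def rna_to_pseudo_protein(rna: str) -> str:
    """Map an RNA sequence onto amino-acid tokens so ESM-2 can consume it."""
    return "".join(RNA_TO_AA[nt] for nt in rna.upper())

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t12_35M_UR50D")
model = EsmModel.from_pretrained("facebook/esm2_t12_35M_UR50D")
model.eval()

rna_seq = "GGGAAACUUCGGUUUCCC"  # toy sequence for demonstration
pseudo_protein = rna_to_pseudo_protein(rna_seq)

with torch.no_grad():
    inputs = tokenizer(pseudo_protein, return_tensors="pt")
    embeddings = model(**inputs).last_hidden_state  # (1, len + 2, hidden_dim)

print(embeddings.shape)  # per-nucleotide embeddings (plus BOS/EOS tokens)
```

In this framing, the per-position embeddings stand in for the representations that RESM would refine through its mapped transfer learning and then feed to downstream heads (e.g., secondary structure or ribosome loading prediction); the actual training procedure is described in the full article.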