A foundation language model to decipher diverse regulation of RNAs

Hanwen Zhou
Yue Hu
Yulong Zheng
Jiefu Li
Jielong Peng
Jiang Hu
Yun Yang
Guoqing Zhang
Zefeng Wang

Abstract

RNA metabolism is tightly regulated by cis-elements and trans-acting factors. Most information guiding such regulation is encoded in RNA sequences. Considering the similarities in semantic and syntactic features between RNAs and human language, we developed LAMAR, a transformer-based foundation language model for RNA regulation, to decipher general rules underlying RNA processing. The model was pretrained on approximately 15 million sequences from both genome and transcriptome of 225 mammals and 1569 viruses, and further fine-tuned with labeled datasets for various tasks. The resulting fine-tuned models outperformed the state-of-the-art methods in predicting mRNA translation efficiency and mRNA half-life, while achieving comparable accuracy to specifically designed methods in predicting splice sites of pre-mRNAs and internal ribosome entry sites. Our results indicated that a single foundation language model is applicable in the comprehensive analysis of different aspects of RNA regulation, providing new insight into the design and optimization of RNA drugs.

Version published to 10.1101/2024.10.12.617732 on bioRxiv
Oct 13, 2024

Related articles

Non-Coding RNA: Architects of Cellular Complexity and Agents of Malignancy

This article has 1 author:
1. Amil Shah
This article has no evaluations. Latest version Jan 28, 2026

Benchmarking Reveals the Superiority of Nucleic Acid Foundation Models in Predicting lncRNA Coding Potential

This article has 5 authors:
1. Yu Yang
2. Liping Ren
3. Juan Feng
4. Yang Zhang
5. Tianyuan Liu
This article has no evaluations. Latest version Dec 17, 2025

A retroelement-derived mammalian ARC protein exhibits selective RNA recognition and nucleic acid chaperone functions

This article has 6 authors:
1. Julita Gumna-Mikina
2. Angelika Andrzejewska-Romanowska
3. Maciej Antczak
4. Ewa Tykwińska
5. Marta Szachniuk
6. Katarzyna Pachulska-Wieczorek
This article has no evaluations. Latest version Jan 27, 2026
