RNAGym: Large-scale Benchmarks for RNA Fitness and Structure Prediction

Abstract

Understanding RNA structure and predicting the functional consequences of mutations are fundamental challenges in computational biology, with broad implications for therapeutic development and synthetic biology. Current evaluation of machine-learning-based RNA models suffers from disparate experimental datasets and inconsistent performance assessments across RNA families. To address these challenges, we introduce RNAGym, a large-scale benchmarking framework designed for three core tasks: RNA fitness, secondary structure, and tertiary structure prediction. The framework integrates extensive datasets: 70 standardized deep mutational scanning assays covering over a million mutations across diverse RNA types; 901k chemical-mapping reactivity profiles for secondary structure; and 215 diverse tertiary structures curated from the PDB. RNAGym is designed to enable systematic comparison of RNA models and to serve as a resource for understanding and developing them.
