Hybrid Generative Model: Bridging Machine Learning and Biophysics to Expand RNA Functional Diversity

Abstract

Functional RNAs perform diverse catalytic roles, yet natural sequences represent only a narrow subset of what is possible. Rediscovering such activities requires exploring functional sequence diversity beyond natural RNAs. We introduce a hybrid generative model that combines a coevolutionary likelihood with an RNA secondary structure prior. This approach disentangles folding constraints from functional signals, enabling targeted diversification. On synthetic benchmarks, the model generates functional sequences beyond the training distribution. On large-scale ribozyme data, it improves the detection of active sequences and enhances sensitivity to local tertiary contacts. Finally, we introduce structural imprinting, a sampling strategy that uses alternative secondary structures to steer generation across under-sampled regions of sequence space. These results show that folding-informed generative modeling improves RNA design by supporting both extrapolation and control.
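
The combination of a coevolutionary likelihood with a secondary structure prior can be illustrated with a toy score. The sketch below is not the authors' model: it assumes a Potts-style coevolution term (site fields plus pairwise couplings) and a deliberately crude structure prior that penalizes positions paired in a dot-bracket target structure when the two bases cannot pair. All function names, the `penalty` parameter, and the additive form of the score are hypothetical choices made for illustration.

```python
# Illustrative sketch only; names and functional forms are assumptions,
# not taken from the preprint.

# Watson-Crick pairs plus G-U wobble, treated as compatible with a base pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_dot_bracket(structure):
    """Return the list of (i, j) paired positions in a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def structure_log_prior(seq, structure, penalty=2.0):
    """Toy prior: subtract `penalty` for each target pair that cannot form."""
    mismatches = sum(
        (seq[i], seq[j]) not in PAIRS for i, j in parse_dot_bracket(structure)
    )
    return -penalty * mismatches


def coevolution_log_likelihood(seq, fields, couplings):
    """Unnormalized Potts-style score: site fields plus pairwise couplings."""
    score = sum(fields[i].get(base, 0.0) for i, base in enumerate(seq))
    for (i, j), table in couplings.items():
        score += table.get((seq[i], seq[j]), 0.0)
    return score


def hybrid_score(seq, structure, fields, couplings, penalty=2.0):
    """Sum of the coevolution term and the structure prior (assumed additive)."""
    return coevolution_log_likelihood(seq, fields, couplings) + structure_log_prior(
        seq, structure, penalty
    )


if __name__ == "__main__":
    fields = [{} for _ in range(5)]            # no site preferences in this toy
    couplings = {(0, 4): {("G", "C"): 1.5}}    # one coupled pair of positions
    for candidate in ("GAAAC", "GAAAA"):
        print(candidate, hybrid_score(candidate, "(...)", fields, couplings))
```

Keeping the two terms separate is what makes the structure argument swappable: under this toy setup, scoring the same sequence against a different dot-bracket string changes only the prior term, which is loosely analogous to the structural imprinting idea described in the abstract.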
