A multi-modal cell-free RNA language model for liquid biopsy applications

Abstract

Cell-free RNA (cfRNA) profiling has emerged as a powerful tool for non-invasive disease detection, but its application is limited by data sparsity and complexity, especially in settings with constrained sample availability. We introduce Exai-1, a multi-modal, transformer-based generative foundation model that integrates RNA sequence embeddings with cfRNA abundance data. By leveraging both sequence and expression modalities, Exai-1 captures a biologically meaningful latent structure of cfRNA profiles. Pre-trained on over 306 billion tokens from 8,339 samples, Exai-1 enhances signal fidelity, reduces technical noise, and improves disease detection by generating synthetic cfRNA profiles. We show that self-attention and variational inference are particularly important for preserving biological signals and contextual relationships. Additionally, Exai-1 facilitates cross-biofluid translation and assay compatibility by disentangling biological signals from confounders. By uniting sequence-informed embeddings with cfRNA expression patterns, Exai-1 establishes a transfer learning foundation for liquid biopsy, offering a scalable and adaptable framework for next-generation cfRNA-based diagnostics.
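
The abstract names three ingredients: fusing sequence embeddings with abundance values, self-attention across RNAs, and variational inference over a latent representation. The toy NumPy sketch below illustrates how such pieces could fit together for a single sample. It is not the Exai-1 architecture: the additive fusion of log-abundance, the single attention head, the mean pooling, and every name (`encode_sample`, `init_params`, `w_abund`, and so on) are assumptions made purely for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def init_params(d_model, d_latent, rng):
    # Hypothetical parameter set; a real model would learn these.
    scale = 1.0 / np.sqrt(d_model)
    return {
        "w_abund": rng.standard_normal(d_model) * scale,
        "Wq": rng.standard_normal((d_model, d_model)) * scale,
        "Wk": rng.standard_normal((d_model, d_model)) * scale,
        "Wv": rng.standard_normal((d_model, d_model)) * scale,
        "W_mu": rng.standard_normal((d_model, d_latent)) * scale,
        "W_logvar": rng.standard_normal((d_model, d_latent)) * scale,
    }


def encode_sample(seq_emb, abundance, params, rng):
    """Encode one sample: seq_emb is (n_rnas, d_model), abundance is (n_rnas,)."""
    # Assumed fusion: shift each RNA's sequence embedding by its log-abundance
    # along a learned direction, giving one token per RNA.
    tokens = seq_emb + np.log1p(abundance)[:, None] * params["w_abund"][None, :]

    # Single-head scaled dot-product self-attention across RNAs.
    q = tokens @ params["Wq"]
    k = tokens @ params["Wk"]
    v = tokens @ params["Wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    context = attn @ v

    # Mean-pool to a sample-level vector, then parameterize a Gaussian posterior.
    pooled = context.mean(axis=0)
    mu = pooled @ params["W_mu"]
    logvar = pooled @ params["W_logvar"]

    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # KL divergence from the posterior to a standard normal prior.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
    return z, mu, logvar, kl, attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(d_model=16, d_latent=4, rng=rng)
    seq_emb = rng.standard_normal((10, 16))            # stand-in sequence embeddings
    abundance = rng.poisson(5.0, size=10).astype(float)  # stand-in cfRNA counts
    z, mu, logvar, kl, attn = encode_sample(seq_emb, abundance, params, rng)
    print(z.shape, attn.shape)
```

In a generative setting, a decoder would map `z` back to an abundance profile, and sampling different `z` values is one way synthetic profiles could be produced; that decoder is omitted here because the abstract does not describe it.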
