Unified multimodal learning enables generalized cellular response prediction to diverse perturbations
Abstract
Cells respond to diverse external interventions through shared regulatory mechanisms, suggesting that diverse interventions may be amenable to unified computational modeling. In practice, however, these responses are profiled across highly heterogeneous experimental settings, including distinct perturbation modalities, dosages, combinations, and cellular contexts. As a result, most computational models remain narrowly tailored to a single perturbation modality or experimental setting. They are difficult to extend to new cell types or perturbation types for which limited training data are available, and offer limited capacity to reuse information across heterogeneous perturbation datasets. Here, we introduce X-Pert, a general perturbation modeling framework that jointly represents external interventions and cellular contexts within a unified multimodal architecture. X-Pert adopts a mechanism-aligned design that explicitly models gene–perturbation interactions together with gene–gene dependencies through dedicated attention mechanisms, enabling the unified handling of heterogeneous experimental settings. Across benchmarks involving both genetic and chemical perturbations, X-Pert consistently outperforms existing methods under conventional accuracy metrics as well as biology-aware evaluations. Importantly, X-Pert exhibits strong generalization across cell types and supports joint learning across perturbation types. The unified latent space learned by X-Pert further enables downstream analyses such as perturbation retrieval and drug–gene association discovery, facilitating the prioritization of candidate gene inhibitors and the identification of anti-cancer compounds. Together, X-Pert establishes a versatile and generalizable foundation for perturbation modeling and predictive virtual cells, with broad applications in biomedical research and therapeutic discovery.