A perturbation proteomics-based foundation model for virtual cell construction

Abstract

Building a virtual cell requires a comprehensive understanding of the cell's protein network dynamics, which in turn necessitates large-scale perturbation proteome data and computational models trained on that data corpus. Here, we generate a large-scale dataset of over 38 million perturbed protein measurements in breast cancer cell lines and develop a neural ordinary differential equation-based foundation model, ProteinTalks. During pretraining, ProteinTalks gains a fundamental understanding of cellular protein network dynamics. Our model encodes protein networks and exhibits consistently improved predictive accuracy across various downstream tasks, highlighting its generalization capabilities and adaptability. In cancer cells, ProteinTalks robustly predicts drug efficacy and synergy, identifies novel drug combinations, and, through its interpretability, uncovers resistance-associated proteins. When applied to a more complex system, patient-derived tumor xenografts, ProteinTalks predicts potential responses to drugs. Integrating the model with clinical patient data improves prognosis prediction for breast cancer patients. Collectively, we present a foundation model based on proteome dynamics, offering potential for various downstream applications, including drug discovery, and providing a basis for developing virtual cells.
