Chemical codes promote selective compartmentalization of proteins

Abstract

Cells have evolved mechanisms to distribute ∼10 billion protein molecules to subcellular compartments, where diverse proteins involved in shared functions must efficiently assemble. Such assembly is presumed to result from specific interactions between biomolecules; however, recent evidence suggests that distinctive chemical environments within subcellular compartments may also play an important role. Here, we test the hypothesis that protein groups with shared functions also share codes that guide them to compartment destinations. To do so, we developed a transformer-based large language model, called ProtGPS, that predicts with high performance the compartment localization of human proteins excluded from the training set. We then demonstrate that ProtGPS can be used for guided generation of novel protein sequences that selectively assemble into specific compartments in cells. Furthermore, ProtGPS predictions were sensitive to disease-associated mutations that change protein compartmentalization, suggesting that this type of pathogenic dysfunction can be discovered in silico. Our results indicate that protein sequences contain not only a folding code, but also a previously unrecognized chemical code governing their distribution to specific cellular compartments.
