Deep Learning-Based Prediction and Suppression of Protein Aggregation-Prone Regions

Abstract

Identifying aggregation-prone regions in proteins and suppressing them through mutations is a powerful strategy to enhance protein solubility and yield, significantly expanding the proteins' application potential. Here, we developed a deep neural network-based predictor, AggreProt, that generates a residue-level aggregation profile for protein sequences. The model outperformed or matched current state-of-the-art algorithms, as validated on two independent datasets comprising hexapeptides and full-length proteins with annotated aggregation-prone regions. We further validated the model experimentally using a set of 34 hexapeptides identified in the model protein haloalkane dehalogenase LinB, along with seven proteins from the AmyPro database. Experimental results agreed with our predictions in 79% of cases and also revealed inaccuracies in some database annotations. Finally, the algorithm’s utility was demonstrated by identifying aggregation-prone regions in the LinB enzyme and designing mutations to suppress aggregation in its exposed regions. The resulting variants exhibited reduced aggregation propensity, improved solubility, and up to a 100% increase in yield compared to the wild type. AggreProt is freely available to the scientific community via a user-friendly web server at https://loschmidt.chemi.muni.cz/aggreprot.
