Deep Learning-Based Prediction and Suppression of Protein Aggregation-Prone Regions
Abstract
Identifying aggregation-prone regions in proteins and suppressing them through mutations is a powerful strategy to enhance protein solubility and yield, significantly expanding the proteins' application potential. Here, we developed a deep neural network-based predictor, AggreProt, that generates a residue-level aggregation profile for protein sequences. The model outperformed or matched current state-of-the-art algorithms, as validated on two independent datasets comprising hexapeptides and full-length proteins with annotated aggregation-prone regions. We further validated the model experimentally using a set of 34 hexapeptides identified in the model protein haloalkane dehalogenase LinB, along with seven proteins from the AmyPro database. Experimental results agreed with our predictions in 79% of cases and also revealed inaccuracies in some database annotations. Finally, the algorithm's utility was demonstrated by identifying aggregation-prone regions in the LinB enzyme and designing mutations to suppress aggregation in its exposed regions. The resulting variants exhibited reduced aggregation propensity, improved solubility, and up to a 100% increase in yield compared to the wild type. AggreProt is freely available to the scientific community via a user-friendly web server: https://loschmidt.chemi.muni.cz/aggreprot.