Large scale analysis of predicted protein structures links model features to in vivo behaviour

Abstract

Rapid advancements in protein structure prediction methods have ushered in a new era of abundant and accurate structural data, providing opportunities to analyse proteins at a scale that has not been possible before. Here we show that features derived solely from predicted structures can be used to understand in vivo protein behaviour using data-driven methods. We found that these features were predictive of in vivo protein production for a set of designed antibodies, enabling identification of high-quality designs. Following on from this result, we calculated these features for a diverse set of ≈500,000 predicted structures, and our analysis showed systematic variation between proteins from different organisms to such an extent that the tree of life could be recapitulated from these data. Given the high degree of functional constraint around the chemistry of proteins, this result is surprising, and could have important implications for the design and engineering of novel proteins.
