Towards a comprehensive view of the pocketome universe—biological implications and algorithmic challenges

Abstract

With reliably predicted 3D structures now available for essentially all known proteins, characterizing the entirety of compound-binding sites (binding pockets on proteins) has become possible. The aim of this study was to identify and analyze all compound-binding sites, i.e., the pocketomes, of eleven species from different kingdoms of life in order to discern evolutionary trends and to arrive at a global cross-species view of the pocketome universe. Computational binding site prediction was performed on all protein structures available for each species in the AlphaFold database. The resulting set of potential binding sites was inspected for overlaps with known pockets and annotated with regard to the protein domains in which they are located. 2D projection plots of all pockets, embedded in a 128-dimensional feature space and characterized with regard to selected physicochemical properties, provide informative global pocketome maps that unveil differentiating features between pockets. Our study revealed a sub-linear scaling law for the number of unique binding sites relative to the number of unique protein structures per species. Thus, as proteomes increased in size during evolution and therefore potentially diversified, the number of distinct binding sites, reflecting potentially diversifying functions, grew less than proportionally. We discuss the biological significance of this finding and identify critical unmet algorithmic challenges.
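
To make the notion of a sub-linear scaling law concrete, the following minimal sketch fits a power law of the form N_pockets ≈ c · N_proteins^α by linear regression in log-log space; an exponent α < 1 corresponds to sub-linear growth. The abstract does not state the functional form or fitting procedure used in the study, so the power-law model, the function name `fit_power_law`, and the counts below are assumptions made purely for illustration. The numbers are synthetic and are not results from the study.

```python
import numpy as np


def fit_power_law(n_proteins, n_pockets):
    """Fit n_pockets ~ c * n_proteins**alpha by least squares in log-log space.

    Hypothetical helper for illustration; not the study's actual method.
    Returns (alpha, c).
    """
    x = np.log(np.asarray(n_proteins, dtype=float))
    y = np.log(np.asarray(n_pockets, dtype=float))
    # polyfit with degree 1 returns [slope, intercept]; the slope is alpha.
    alpha, log_c = np.polyfit(x, y, 1)
    return alpha, np.exp(log_c)


# Synthetic, illustrative counts only (NOT data from the study):
# pockets are generated from an exact power law with exponent 0.8.
proteins = np.array([1000.0, 4000.0, 16000.0, 64000.0])
pockets = 5.0 * proteins ** 0.8

alpha, c = fit_power_law(proteins, pockets)
is_sublinear = alpha < 1.0
print(f"alpha = {alpha:.2f}, c = {c:.2f}, sub-linear: {is_sublinear}")
```

With real per-species counts, one would pass the number of unique protein structures and the number of unique binding sites for each species and inspect whether the fitted exponent, together with its uncertainty, lies below 1.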
