Evolutionary and geometric signatures reveal ligand-binding sites across proteomes

Abstract

Identifying protein binding sites is central to drug discovery, yet many computational approaches still trade off precision, recall, or throughput when scaled. We introduce PickPocket, a deep learning model that fuses evolutionary information from protein sequences with geometric representations of structure to identify ligand-binding residues at proteome scale. By leveraging complementary sequence context and spatial neighborhoods, PickPocket generalizes across diverse protein families and ligand chemistries while operating in a recall-oriented setting with competitive precision. In benchmark evaluations it delivers strong residue-level recovery and, despite no explicit training on conformational switching, reliably identifies cryptic pockets in held-out structures, comparing favorably with specialized approaches. Applied across 356,711 proteins, the method nominates previously unannotated candidate sites enriched for functional signals and highlights tractable surface chemistry on therapeutically relevant targets. These results position evolutionary-geometric fusion as a practical foundation for large-scale site mapping that can shorten the path from structure to experiment and support hit discovery, mutagenesis design, and target assessment.
