AstraBIND: Graph Attention Network for Predicting Ligand Binding Sites

Abstract

Predicting ligand binding sites is central to computational biology and drug discovery. Existing machine learning approaches use protein sequence, structure, or both. Although structure-based deep learning models typically outperform sequence-based methods, they often incur high computational cost or require ligand-specific data, forcing a trade-off between accuracy and scalability.

We present AstraBIND, a lightweight graph neural network that bridges this gap by integrating protein sequence, structure (experimental or predicted), and homology information to predict ligand classes and binding residues within minutes. The model employs a GATv2 architecture with 0.9 M parameters, trained on >250 000 curated protein–ligand complexes across 16 ligand categories. By encoding residue-level features and spatial geometry through graph attention, AstraBIND identifies binding residues and ligand types while maintaining structural consistency.
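
To make the graph attention step concrete, the sketch below computes GATv2-style attention weights on a toy residue graph in plain Python. It uses the general GATv2 scoring form, e_ij = a^T LeakyReLU(W [h_i || h_j]), followed by a softmax over each node's neighbors. It is not AstraBIND's implementation: the feature vectors, neighbor lists, weight matrix `W`, and attention vector `a` are all made-up illustrative values, and the function names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU nonlinearity applied to a single scalar."""
    return x if x > 0 else slope * x


def gatv2_attention(h, neighbors, W, a, slope=0.2):
    """Return {i: {j: alpha_ij}} for every node i and each neighbor j.

    h         : {node: feature list}        (toy residue features, assumed)
    neighbors : {node: non-empty node list} (assumed to include self-loops)
    W         : weight matrix, shape (d_out, 2 * d_in)
    a         : attention vector, length d_out
    """
    alphas = {}
    for i, nbrs in neighbors.items():
        scores = []
        for j in nbrs:
            pair = h[i] + h[j]  # list concatenation: [h_i || h_j]
            z = [sum(w * x for w, x in zip(row, pair)) for row in W]
            z = [leaky_relu(v, slope) for v in z]
            scores.append(sum(ak * zk for ak, zk in zip(a, z)))
        # Numerically stable softmax over the neighborhood of node i.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return alphas


# Toy graph with three "residues" and 2-dimensional features (illustrative only).
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}
W = [
    [0.5, -0.2, 0.1, 0.3],
    [0.0, 0.4, -0.3, 0.2],
    [0.1, 0.1, 0.2, -0.4],
]
a = [0.7, -0.5, 0.3]

alphas = gatv2_attention(h, neighbors, W, a)
for node, weights in alphas.items():
    print(node, {j: round(w, 3) for j, w in weights.items()})
```

In a residue graph, the neighbor lists would typically come from a spatial distance cutoff between residues; that cutoff, and how the attention weights feed into per-residue predictions, are not specified in the abstract and are left out of this sketch.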

In benchmarking, AstraBIND achieved a weighted macro-F1 of 0.47 across all ligand classes, with top performance for nucleotides (F1 = 0.79), porphyrins (0.74), and cofactors (0.73). Case studies, including p53 and CRFR1, demonstrate robust pocket localization for diverse proteins. Combined with its minimal inference time and broad ligand coverage, AstraBIND enables rapid in-silico screening and integration into laboratory workflows. Together with other Astra ML models (1; 2), it represents a step toward real-time protein design and validation pipelines.
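
As an illustration of how per-class F1 scores can be aggregated into a single figure, the sketch below computes both an unweighted macro average and a support-weighted average from per-class confusion counts. The abstract does not define the exact weighting behind its reported score, so the support weighting here is an assumption; the class names and counts are invented toy values, not results from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def averaged_f1(counts):
    """Return (macro_f1, support_weighted_f1).

    counts: {class_name: (tp, fp, fn)}; support of a class is tp + fn,
    i.e. the number of true binding residues of that class.
    """
    scores = {c: f1_score(*v) for c, v in counts.items()}
    macro = sum(scores.values()) / len(scores)
    support = {c: tp + fn for c, (tp, fp, fn) in counts.items()}
    total = sum(support.values())
    weighted = sum(scores[c] * support[c] for c in scores) / total
    return macro, weighted


# Hypothetical counts for two ligand classes (toy numbers, not from the paper).
toy_counts = {
    "class_a": (8, 2, 2),  # F1 = 16 / 20
    "class_b": (1, 3, 3),  # F1 = 2 / 8
}
macro, weighted = averaged_f1(toy_counts)
print(f"macro F1 = {macro:.3f}, support-weighted F1 = {weighted:.3f}")
```

The two averages diverge whenever class sizes are imbalanced: the macro average treats every ligand class equally, while the support-weighted average is dominated by the most frequent classes.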

Astra models are available at https://www.orbion.life.
