BiLSTM-Powered Bilinear Attention for Protein–Ligand Prediction

Abstract

Rapid and accurate prediction of protein-ligand binding is essential for drug discovery. While generative AI has driven rapid advances in structure-based approaches, sequence-based methods remain significantly faster and more cost-effective. Here, we present a weakly supervised deep learning framework that integrates graph convolutional networks (GCN) for molecular encoding with a bidirectional long short-term memory (BiLSTM) network for protein modeling. The BiLSTM represents long-range dependencies better than the widely used convolutional neural network (CNN). Using a bilinear attention network (BAN), the model learns pairwise protein-ligand interactions without requiring three-dimensional structural supervision. Trained solely on affinity labels from the publicly available BindingDB dataset, the model classified binders and non-binders with an AUROC of 0.96 and an AUPRC of 0.95. The model generates interpretable attention maps that serve as a “GPS” to locate binding sites. Remarkably, despite the lack of structural training data, it can pinpoint key contact residues confirmed by crystal structures. Our method could function as a scalable filter for giga-scale libraries, allowing rapid screening of drug candidates with direct structural insights into the protein-ligand interface.
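
To make the pairwise attention idea concrete, the following is a minimal pure-Python sketch of a single bilinear attention head over ligand atom features and protein residue features. It is an illustration under stated assumptions, not the authors' implementation: the function names, the projection matrices `U` and `W`, the scoring vector `q`, the ReLU activation, and the softmax over all atom-residue pairs are all hypothetical choices made here, and the random inputs merely stand in for GCN and BiLSTM outputs.

```python
import math
import random


def project(X, P):
    """Project each row of X (n x d_in) with P (d_in x k), then apply ReLU."""
    return [[max(0.0, sum(x[a] * P[a][c] for a in range(len(P))))
             for c in range(len(P[0]))] for x in X]


def bilinear_attention(atoms, residues, U, W, q):
    """Return (attention map, joint vector) for one ligand-protein pair.

    atoms:    n x d_a ligand atom features (stand-in for GCN output)
    residues: m x d_r residue features (stand-in for BiLSTM output)
    U, W:     hypothetical projections into a shared k-dimensional space
    q:        hypothetical k-dimensional vector scoring each pair
    """
    A = project(atoms, U)
    R = project(residues, W)
    k = len(q)
    # Bilinear score for every atom-residue pair.
    scores = [[sum(q[c] * a[c] * r[c] for c in range(k)) for r in R] for a in A]
    # Assumption: normalize with a softmax over all pairs so the map sums to 1.
    top = max(max(row) for row in scores)
    exps = [[math.exp(s - top) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    attn = [[e / total for e in row] for row in exps]
    # Attention-weighted bilinear pooling into one joint feature vector.
    joint = [sum(attn[i][j] * A[i][c] * R[j][c]
                 for i in range(len(A)) for j in range(len(R)))
             for c in range(k)]
    return attn, joint


def residue_importance(attn):
    """Sum attention over atoms to get one weight per residue."""
    return [sum(row[j] for row in attn) for j in range(len(attn[0]))]


if __name__ == "__main__":
    rng = random.Random(0)
    rand = lambda n, d: [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    atoms, residues = rand(6, 4), rand(10, 5)   # 6 atoms, 10 residues
    U, W, q = rand(4, 3), rand(5, 3), rand(1, 3)[0]
    attn, joint = bilinear_attention(atoms, residues, U, W, q)
    weights = residue_importance(attn)
    print("highest-weight residue index:", weights.index(max(weights)))
```

In a sketch like this, the joint vector would feed a binary classifier trained on binder and non-binder labels, while the per-residue weights are the kind of quantity one could compare against contact residues in a crystal structure.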
