Sequence-based Drug-Target Complex Pre-training Enhances Protein-Ligand Binding Process Predictions Tackling Crypticity
Abstract
Predicting protein-ligand binding processes, such as affinity and kinetics, is critical for accelerating drug discovery. However, many existing computational methods face key limitations, including insufficient integration of comprehensive databases, inadequate representation of protein structural dynamics, and incomplete modeling of microscale protein-ligand interactions. To address these challenges, we introduce ProMoNet, a sequence-based pre-training and fine-tuning framework to enhance protein-ligand binding process prediction. ProMoNet connects protein and molecular foundation models to expand data coverage and enhance diversity, and it integrates large-scale binding site pre-training with efficient fine-tuning for affinity and kinetics prediction. During pre-training, it models microscale protein-ligand interactions and captures the dynamic nature of proteins, including binding site crypticity, without relying on three-dimensional structural inputs. Notably, ProMoNet’s pre-training module surpasses or matches state-of-the-art structure-based methods in identifying exposed and cryptic binding sites. In the fine-tuning stage, it transfers pre-trained knowledge, achieving superior performance in affinity and kinetics prediction tasks with high computational efficiency. The combination of ProMoNet’s modeling capabilities and its demonstrated success across multiple tasks highlights its potential for broad applications in drug discovery.