Beyond the Leaderboard: Leveraging Predictive Modeling for Protein-Ligand Insights and Discovery
Abstract
Motivation
Ligands are biomolecules that bind to specific sites on target proteins, often inducing conformational changes important to the protein’s function. Knowledge of protein-ligand interactions is fundamental to understanding biological mechanisms and advancing drug discovery. Traditional protein language models focus on amino acid sequences and three-dimensional structures, overlooking the structural and functional changes induced by protein-ligand interactions. We investigate the value of integrating protein-ligand binding data in several predictive tasks and use the findings to frame research directions and questions.
Results
We show that integrating protein-ligand interaction data into protein representation learning can increase predictive power. We evaluate the methodology across diverse biological tasks, demonstrating consistent improvements over state-of-the-art models. We further demonstrate that examining the specific gains in predictive performance that come with introducing the ligand modality can focus attention on, and provide insights into, biological mechanisms. By leveraging large pretrained protein language models and enriching them with interaction-specific features through a tailored learning process, we capture functional and structural nuances of proteins in their biochemical context.
Availability and implementation
The full code and data are freely available at https://github.com/kalifadan/ProtLigand.
Contact
kalifadan@cs.technion.ac.il