Beyond the Leaderboard: Leveraging Predictive Modeling for Protein-Ligand Insights and Discovery

Abstract

Motivation

Ligands are molecules that bind to specific sites on target proteins, often inducing conformational changes that are important to the protein’s function. Knowledge of how ligands interact with proteins is fundamental to understanding biological mechanisms and advancing drug discovery. Traditional protein language models focus on amino acid sequences and three-dimensional structures, overlooking the structural and functional changes induced by protein-ligand interactions. We investigate the value of integrating protein-ligand binding data in several predictive challenges and use the findings to frame research directions and questions.

Results

We show how integrating protein-ligand interaction data into protein representation learning can increase predictive power. We evaluate the methodology across diverse biological tasks, demonstrating consistent improvements over state-of-the-art models. We further demonstrate that studying the specific gains in predictive capability that come with introducing the ligand modality can focus attention on, and provide insights into, biological mechanisms. By leveraging large pretrained protein language models and enriching them with interaction-specific features through a tailored learning process, we capture functional and structural nuances of proteins in their biochemical context.

Availability and implementation

The full code and data are freely available at https://github.com/kalifadan/ProtLigand .

Contact

kalifadan@cs.technion.ac.il
