DeepAllo: allosteric site prediction using protein language model (pLM) with multitask learning

Moaaz Khokhar
Ozlem Keskin
Attila Gursoy

Abstract

Motivation

Allostery, the process by which binding at one site perturbs a distant site, has become a key focus in drug development because of its substantial impact on protein function. Identifying allosteric pockets (sites) is a challenging task, and several techniques have been developed for it, including machine learning methods that predict allosteric pockets using both static and pocket features.

Results

Our work, DeepAllo, is the first study that combines a fine-tuned protein language model (pLM) with FPocket features, and it shows an increase in allosteric site prediction performance over previous studies. The pLM was fine-tuned on the AlloSteric Database (ASD) in a multitask learning setting and was then used as a feature extractor to train XGBoost and AutoML models. The best model achieves an F1 score of 89.66% and places 90.5% of allosteric pockets in the top 3 positions, outperforming previous results. A case study on proteins with known allosteric pockets supports our approach. Moreover, an effort was made to explain the pLM by visualizing its attention mechanism among allosteric and non-allosteric residues.
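
The two headline metrics can be illustrated with a minimal sketch. It assumes that each pocket carries a predicted score and a binary label, that "top 3" counts an allosteric pocket as a hit when it ranks among the three highest-scoring pockets of its own protein, and that F1 is computed over all pockets at a fixed score threshold. The abstract does not spell out these definitions, and the function names, threshold, and toy data below are hypothetical rather than taken from the DeepAllo code.

```python
from typing import Dict, List, Tuple

# One pocket = (predicted score, label); label 1 means allosteric (assumed encoding).
Pocket = Tuple[float, int]


def top_k_fraction(proteins: Dict[str, List[Pocket]], k: int = 3) -> float:
    """Fraction of allosteric pockets ranked within the top k of their protein."""
    hits = 0
    total = 0
    for pockets in proteins.values():
        # Rank the pockets of one protein by descending predicted score.
        ranked = sorted(pockets, key=lambda p: p[0], reverse=True)
        for rank, (_, label) in enumerate(ranked, start=1):
            if label == 1:
                total += 1
                if rank <= k:
                    hits += 1
    return hits / total if total else 0.0


def f1_score(proteins: Dict[str, List[Pocket]], threshold: float = 0.5) -> float:
    """F1 over all pockets, calling a pocket allosteric when score >= threshold."""
    tp = fp = fn = 0
    for pockets in proteins.values():
        for score, label in pockets:
            predicted = score >= threshold
            if predicted and label == 1:
                tp += 1
            elif predicted and label == 0:
                fp += 1
            elif not predicted and label == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: two proteins with four candidate pockets each.
toy = {
    "protein_a": [(0.9, 1), (0.4, 0), (0.2, 0), (0.1, 0)],
    "protein_b": [(0.8, 0), (0.7, 0), (0.6, 0), (0.3, 1)],
}

top3 = top_k_fraction(toy, k=3)
f1 = f1_score(toy)
```

The ranking metric is computed per protein because pocket scores are only comparable among the candidate pockets of the same structure, whereas F1 depends on the chosen threshold.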

Availability and implementation

The source code is available on GitHub (https://github.com/MoaazK/deepallo) and archived on Zenodo (DOI: 10.5281/zenodo.15255379). The trained model is hosted on Hugging Face (DOI: 10.57967/hf/5198). The dataset used for training and evaluation is archived on Zenodo (DOI: 10.5281/zenodo.15255437).

Version published to 10.1093/bioinformatics/btaf294
May 15, 2025
Version published to 10.1101/2024.10.09.617427v2 on bioRxiv
Feb 7, 2025
Version published to 10.1101/2024.10.09.617427v1 on bioRxiv
Oct 13, 2024

