A unified multimodal model for generalizable zero-shot and supervised protein function prediction

Abstract

Predicting protein function is a fundamental yet challenging task that requires integrating diverse biological data modalities to capture complex functional relationships. Traditional machine learning methods often rely on a single modality or combine only a limited number (typically two), without aligning them in a unified representation, thereby constraining predictive accuracy. Moreover, most existing approaches are limited to preselected subsets of Gene Ontology (GO) function terms with sufficient annotations, making the prediction of novel function terms a persistent challenge. Here, we present FunBind, a multimodal AI model that jointly learns from protein sequences, textual descriptions, domain annotations, structural features, and GO terms to enhance prediction accuracy and infer previously unseen functions. FunBind operates in two modes: (1) self-supervised pretraining using contrastive learning to align the sequence modality with other heterogeneous modalities in a unified latent space, enabling unsupervised zero-shot function prediction, and (2) supervised fine-tuning that leverages all modalities for comprehensive and accurate function classification. Our results show that FunBind's zero-shot capabilities allow it to generalize effectively to function terms never encountered during training, while its joint fine-tuning strategy substantially outperforms single-modality models and current state-of-the-art approaches in prediction accuracy.
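The contrastive pretraining and zero-shot modes described above can be illustrated with a minimal, self-contained sketch. This is not FunBind's actual architecture: the linear "encoders", dimensions, and temperature below are illustrative stand-ins for the model's real sequence and text encoders. It shows the general CLIP-style recipe the abstract describes: align paired (sequence, GO-term text) embeddings with a symmetric InfoNCE loss, then score an unseen protein against candidate GO-term embeddings by cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: raw sequence features, GO-term text features, shared space
d_seq, d_txt, d_shared = 8, 6, 4
W_seq = rng.normal(size=(d_shared, d_seq))  # stand-in for a sequence encoder
W_txt = rng.normal(size=(d_shared, d_txt))  # stand-in for a text encoder

def embed(x, W):
    """Project a feature vector into the shared latent space and L2-normalize."""
    z = W @ x
    return z / np.linalg.norm(z)

def info_nce(seq_feats, txt_feats, tau=0.07):
    """Symmetric InfoNCE loss over a batch of paired (sequence, GO text) examples."""
    S = np.stack([embed(x, W_seq) for x in seq_feats])  # (B, d_shared)
    T = np.stack([embed(x, W_txt) for x in txt_feats])  # (B, d_shared)
    logits = S @ T.T / tau          # temperature-scaled cosine similarities
    labels = np.arange(len(S))      # matching pairs lie on the diagonal

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)            # numerical stability
        p = np.exp(l) / np.exp(l).sum(axis=1, keepdims=True)
        return -np.log(p[labels, labels]).mean()

    # cross-entropy in both directions: sequence→text and text→sequence
    return 0.5 * (xent(logits) + xent(logits.T))

def zero_shot_scores(seq_feat, go_term_feats):
    """Score one protein against candidate GO-term embeddings by cosine similarity."""
    s = embed(seq_feat, W_seq)
    return np.array([s @ embed(t, W_txt) for t in go_term_feats])

# Usage: compute the pretraining loss on a toy batch, then rank three
# candidate GO terms for the first protein.
batch_seq = rng.normal(size=(3, d_seq))
batch_txt = rng.normal(size=(3, d_txt))
loss = info_nce(batch_seq, batch_txt)
scores = zero_shot_scores(batch_seq[0], batch_txt)
```

Because scoring only compares embeddings, any GO term with a textual description can be ranked, including terms absent from the training annotations; this is what enables generalization to unseen function terms.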