Hybrid Gated Fusion: A Multimodal Deep Learning Framework for Protein Function Annotation

Abstract

Protein function annotation requires integrating diverse biological signals, yet existing multimodal methods often struggle with missing inputs and redundant information. We present Hybrid Gated Fusion, a multimodal architecture that combines intrinsic protein features, including sequence and structure, with extrinsic functional context from text and interaction networks. Rather than weighting all modalities equally, the model uses bilinear gating to assess both the informativeness of each modality and its agreement with the others, while auxiliary supervision reduces modality dominance and preserves useful signal in weaker modalities. On the CAFA3 benchmark, a single Hybrid Gated Fusion model achieves state-of-the-art performance in Biological Process (Fmax = 0.601) and Cellular Component (Fmax = 0.706), while remaining competitive in Molecular Function (Fmax = 0.702). Analysis of the learned gates shows that interaction networks and text often provide complementary functional signals, whereas structural features are down-weighted when redundant but remain valuable under sparse-input settings. These results establish Hybrid Gated Fusion as a robust and scalable framework for genome-scale protein function annotation.
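
For readers unfamiliar with the Fmax values quoted above, the following is a minimal sketch of the protein-centric Fmax as it is commonly defined for CAFA-style evaluation: at each score threshold, precision is averaged over proteins with at least one prediction, recall is averaged over all annotated proteins, and the best resulting F-measure is reported. It is not taken from the authors' repository. The function name `fmax`, the threshold grid, and the toy term labels are illustrative assumptions, and ontology propagation of annotations is omitted.

```python
def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax (illustrative sketch, not the authors' code).

    y_true:  list of sets, the true terms for each protein
    y_score: list of dicts, term -> predicted score in [0, 1] for each protein
    """
    if thresholds is None:
        # Assumed grid of 0.01 .. 0.99; the paper's grid is not stated here.
        thresholds = [i / 100 for i in range(1, 100)]

    n_annotated = sum(1 for truth in y_true if truth)
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for truth, scores in zip(y_true, y_score):
            predicted = {term for term, s in scores.items() if s >= t}
            true_pos = len(predicted & truth)
            # Precision only counts proteins with at least one prediction.
            if predicted:
                precisions.append(true_pos / len(predicted))
            # Recall is averaged over every annotated protein.
            if truth:
                recall_sum += true_pos / len(truth)
        if not precisions or n_annotated == 0:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / n_annotated
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy example with hypothetical term labels.
truth = [{"A", "B"}, {"C"}]
scores = [{"A": 0.9, "B": 0.2, "D": 0.4}, {"C": 0.8, "A": 0.3}]
print(round(fmax(truth, scores), 3))  # 0.857
```

In the toy example, the best threshold keeps only the two confident correct predictions, giving precision 1.0 and recall 0.75.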

Availability and implementation

Source code and reproduction scripts are freely available at https://github.com/psipred/PFP. Pre-computed embeddings, data splits, and model checkpoints are deposited at https://doi.org/10.5281/zenodo.19498341.
