An Isoform-Centric, Structure-Aware Framework for Protein Function Prediction and Evaluation, Instantiated in 3DisoDeepPF


Abstract

Understanding and accurately predicting function across protein isoforms is a long-standing challenge with profound implications for both biological and translational research. However, because high-quality isoform-resolved annotations are scarce, most computational methods for protein function prediction have been developed and benchmarked using a single reference sequence per gene. To address this gap, we present an isoform-centric framework for protein family (Pfam) domain and Gene Ontology (GO) term prediction. We implemented this framework in 3DisoDeepPF, a deep multi-label learning model that integrates a dense graph of sequence and structure similarity with multimodal representations, and applied it to a breast cancer isoform-specific atlas. 3DisoDeepPF improves GO-term and Pfam-domain prediction over all baselines and state-of-the-art models in both conventional and isoform-resolved settings. It also remains robust under homology-controlled isoform-level evaluation and resolves directional Pfam remodeling among isoforms from the same gene. An evidence-tracing module links predicted labels to associated proteins, as illustrated by a CIB1 breast cancer isoform case study. Overall, 3DisoDeepPF provides a framework for resolving disease-relevant isoform function and can support hypothesis generation and future prioritization of cancer-associated isoforms.
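To make "directional Pfam remodeling" concrete, the sketch below shows one way the comparison between two isoforms of the same gene could be expressed, assuming each isoform is summarized as a set of predicted Pfam domain labels. The function name `domain_remodeling` and the labels `domain_A`, `domain_B`, and `domain_C` are hypothetical placeholders for illustration; this is not the 3DisoDeepPF implementation, and the abstract does not specify how the model computes remodeling.

```python
from typing import Dict, FrozenSet, Set


def domain_remodeling(reference: Set[str], alternative: Set[str]) -> Dict[str, FrozenSet[str]]:
    """Compare predicted domain sets of two isoforms, keeping direction.

    'lost' and 'gained' are defined relative to the reference isoform,
    so swapping the arguments swaps the two categories.
    """
    return {
        "lost": frozenset(reference - alternative),      # in reference only
        "gained": frozenset(alternative - reference),    # in alternative only
        "retained": frozenset(reference & alternative),  # in both isoforms
    }


# Hypothetical predicted labels for two isoforms of one gene (assumed data).
reference_isoform = {"domain_A", "domain_B"}
alternative_isoform = {"domain_B", "domain_C"}

remodeling = domain_remodeling(reference_isoform, alternative_isoform)
for change, domains in remodeling.items():
    print(change, sorted(domains))
```

Keeping losses and gains in separate sets, rather than reporting a single symmetric difference, preserves the direction of the change relative to the chosen reference isoform.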
