Integrating protein sequence embeddings with structure via graph-based deep learning for single-residue property prediction

Kevin Michalewicz
Mauricio Barahona
Barbara Bravi


Abstract

Understanding the intertwined contributions of amino acid sequence and spatial structure is essential to explain protein behaviour. Here, we introduce INFUSSE (Integrated Network Framework Unifying Structure and Sequence Embeddings), a deep learning framework for predicting single-residue properties. INFUSSE combines fine-tuning of sequence embeddings derived from a Large Language Model with graph-based representations of protein structure, incorporated via a diffusive Graph Convolutional Network. To illustrate the benefits of jointly leveraging sequence and structure, we apply INFUSSE to the prediction of B-factors in antibodies, a residue property that reflects the local flexibility shaped by biochemical and structural constraints in these highly variable and dynamic proteins. Using a dataset of 1510 antibody and antibody-antigen complexes from the SAbDab database, we show that INFUSSE improves performance over current machine learning (ML) methods based on sequence or structure alone, and makes it possible to systematically disentangle the contributions of sequence and structure to that performance. Our results show that adding structural information via geometric graphs enhances predictions especially for intrinsically disordered regions, protein-protein interaction sites, and highly variable amino acid positions. These are all key structural features for antibody function that are not well captured by purely sequence-based ML descriptions.
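To make the idea of combining per-residue sequence embeddings with a geometric graph concrete, here is a minimal NumPy sketch. It is not the INFUSSE implementation: the distance cutoff, the random-walk form of the diffusion operator, the number of diffusion steps, and every function name below are assumptions chosen for illustration, and random arrays stand in for the language-model embeddings and the residue coordinates.

```python
import numpy as np

def geometric_graph(coords, cutoff=10.0):
    """Adjacency linking residues closer than `cutoff` (an assumed threshold)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dists < cutoff).astype(float)
    np.fill_diagonal(adj, 1.0)  # keep self-loops so every row has degree >= 1
    return adj

def diffusion_operator(adj):
    """Row-normalised random-walk matrix D^{-1} A (one common choice)."""
    return adj / adj.sum(axis=1, keepdims=True)

def diffusive_graph_layer(embeddings, adj, weights):
    """Compute ReLU(sum_k P^k X W_k), a generic diffusion-style graph convolution."""
    P = diffusion_operator(adj)
    out = np.zeros((embeddings.shape[0], weights[0].shape[1]))
    diffused = embeddings
    for W in weights:            # k = 0, 1, ..., len(weights) - 1
        out += diffused @ W
        diffused = P @ diffused  # one more diffusion step over the graph
    return np.maximum(out, 0.0)

def predict_per_residue(embeddings, coords, weights, readout):
    """Map each residue to a single scalar (for example a B-factor-like score)."""
    hidden = diffusive_graph_layer(embeddings, geometric_graph(coords), weights)
    return hidden @ readout

# Toy inputs: random placeholders, not real embeddings or structures.
rng = np.random.default_rng(0)
n_res, d_in, d_hid = 6, 8, 4
embeddings = rng.normal(size=(n_res, d_in))
coords = rng.normal(scale=5.0, size=(n_res, 3))
weights = [rng.normal(size=(d_in, d_hid)) for _ in range(3)]
readout = rng.normal(size=d_hid)
scores = predict_per_residue(embeddings, coords, weights, readout)
```

In this sketch the k = 0 term uses only each residue's own embedding, while the higher-order terms mix in information from spatial neighbours. Zeroing those higher-order weights gives a sequence-only baseline, which is one simple way to picture how sequence and structure contributions might be separated.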

Version published to 10.21203/rs.3.rs-8043216/v1 on Research Square
Nov 7, 2025

Related articles

Enhancing molecular property prediction via transformer with dual graph representation
Authors: Shuyuan Zhang, Alexei Lapkin
No evaluations. Latest version: Dec 9, 2025

Quantum-Assisted Refinement of AlphaFold Protein Structures
Author: Parham Ghayour
No evaluations. Latest version: Dec 31, 2025

Emergence of Biological Structural Discovery in General-Purpose Language Models
Author: Liang Wang
No evaluations. Latest version: Jan 8, 2026
