Disentangling Protein Function via Decoupled Information Theoretic Selection of Key Tuning Residues

Abstract

A key challenge in rational protein engineering is identifying residues that modulate function without disrupting it. Existing computational methods struggle to distinguish genuine functional sites from positions that coevolve because of structural constraints, leading to high false-discovery rates. Here we present an information-theoretic decoupling framework that, without machine learning, isolates key tuning residues by computationally “denoising” sequence data: it iteratively removes confounding evolutionary signals to reveal the underlying functional sites. We validated this framework across 10 datasets spanning enzymes, fluorescent proteins, and antibodies. In a nanobody-antigen binding case study, our method identified 6 of 20 (30%) verified binding-critical residues (p = 0.031), while the best of five benchmarked tools found none. Performance was consistent across all datasets: relative to the benchmarks, supervised variants achieved large effect sizes (Hedges’ g > 0.7, p < 0.01), and unsupervised variants also showed gains (g > 0.2, p < 0.05). This interpretable framework provides a generalizable method to accelerate protein design, from focusing antibody maturation to optimizing biocatalysts.
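
For readers unfamiliar with the effect-size measure quoted above, the following is a minimal sketch of the standard Hedges' g formula (Cohen's d with a small-sample bias correction). It is not taken from the article: the function name `hedges_g` and the sample scores are hypothetical, and the article's exact statistical procedure may differ.

```python
import math
from statistics import mean, variance


def hedges_g(a, b):
    """Hedges' g for two independent samples (hypothetical helper, not the authors' code)."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")

    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")

    # Cohen's d: standardized difference of the means.
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)

    # Common approximation of the small-sample bias correction factor.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return correction * d


# Illustrative, made-up scores for a method and a benchmark.
method_scores = [2, 4, 6]
benchmark_scores = [1, 3, 5]
print(hedges_g(method_scores, benchmark_scores))
```

A positive value means the first sample scores higher on average; swapping the arguments flips the sign.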
