Unlocking diverse molecular binding modes using latent feature sampling
Abstract
Deep learning has revolutionized the prediction of biomolecular structures, yet accurately identifying diverse intermolecular binding sites remains a persistent challenge. Current state-of-the-art models often stagnate in dominant local minima determined by co-evolutionary biases and fail to sample biologically critical alternative conformations. Here we present DynaHelix, a framework that actively steers the generative process to unlock diverse binding modes encoded within the latent space of structure prediction models. Unlike methods that rely on stochastic perturbations of multiple sequence alignments, DynaHelix analyzes intermediate pairwise features to identify potential interaction hotspots. These hotspots are converted into targeted geometric constraints that drive the diffusion model to explore specific, under-represented interfaces. We demonstrate that DynaHelix significantly outperforms leading open-source models, including Protenix, Chai-1, and Boltz-1, on the comprehensive FoldBench benchmark and on challenging targets from CASP15 and CASP16. Notably, our approach achieves superior structural accuracy for complex antigen–antibody and protein–DNA interactions while reducing the computational sampling budget by an order of magnitude. By enabling deterministic control over the sampling landscape, DynaHelix offers a generalizable pathway for resolving high-ambiguity molecular interactions essential for mechanistic biology and drug discovery.
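The abstract does not specify how hotspots are scored or how constraints are encoded, so the following is only a toy sketch of the general idea of turning a pairwise score map into inter-chain distance constraints. The function name `select_hotspot_constraints`, the top-k selection rule, the symmetrization step, and the 8.0 Å cutoff are all assumptions made for illustration and are not taken from DynaHelix.

```python
from typing import Dict, List, Sequence, Union

Constraint = Dict[str, Union[int, float]]


def select_hotspot_constraints(
    pair_scores: Sequence[Sequence[float]],
    chain_ids: Sequence[str],
    top_k: int = 3,
    max_distance: float = 8.0,  # hypothetical cutoff in angstroms
) -> List[Constraint]:
    """Rank inter-chain residue pairs by score and return the top_k
    as hypothetical upper-bound distance constraints."""
    n = len(chain_ids)
    candidates = []
    for i in range(n):
        for j in range(i + 1, n):
            # Only pairs that span two different chains can form an interface.
            if chain_ids[i] == chain_ids[j]:
                continue
            # Average both directions in case the score map is not symmetric.
            score = 0.5 * (pair_scores[i][j] + pair_scores[j][i])
            candidates.append((score, i, j))

    # Highest-scoring pairs first; ties keep their original order.
    candidates.sort(key=lambda c: c[0], reverse=True)

    return [
        {"residue_i": i, "residue_j": j, "max_distance": max_distance, "score": s}
        for s, i, j in candidates[:top_k]
    ]


if __name__ == "__main__":
    # Made-up 4-residue example: residues 0-1 on chain A, 2-3 on chain B.
    demo_scores = [
        [0.0, 0.9, 0.1, 0.7],
        [0.9, 0.0, 0.2, 0.3],
        [0.1, 0.2, 0.0, 0.8],
        [0.7, 0.3, 0.8, 0.0],
    ]
    for constraint in select_hotspot_constraints(demo_scores, ["A", "A", "B", "B"], top_k=2):
        print(constraint)
```

In this sketch the high intra-chain scores (0.9 and 0.8) are ignored, because only pairs spanning different chains are candidates for an interface. How such constraints would actually be passed to a diffusion-based structure predictor is outside what the abstract describes.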