Adversarial Sequence Mutations in AlphaFold and ESMFold Reveal Nonphysical Structural Invariance, Confidence Failures, and Concerns for Protein Design
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
AlphaFold has transformed structural biology and spawned an ecosystem of derivative tools for protein design, binding prediction, and drug discovery. However, whether AlphaFold has learned generalizable biophysical principles versus template-based pattern matching remains unclear—a distinction critical for applications beyond its training context. Here, we perform a systematic adversarial evaluation of AlphaFold 3 using point and deletion mutations across 200 proteins. Remarkably, predicted structures remain invariant to mutations of up to 40% of residues—including deliberately destabilizing substitutions—and to deletions of 10%. Notably, this invariance holds even for experimentally validated fold-switching proteins that are known to adopt alternative conformations in response to such mutations, despite the fact that these proteins are small and monomeric—precisely the category where AlphaFold is expected to perform best. Confidence metrics prove unreliable, as they select the most accurate structure at most 35% of the time and correlate with the structural quality of the best available training set template. This suggests that AlphaFold’s uncertainty estimates reflect template availability more than biophysical reasoning. ESMFold exhibits greater, though still imperfect, mutational sensitivity, suggesting superior sequence-structure coupling. These findings indicate that AlphaFold may rely heavily on memorized templates rather than biophysical reasoning, with profound implications for the reliability of AlphaFold-based protein design, drug discovery, and modeling workflows.
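To make the perturbation scale concrete, the following is a minimal sketch of how point and deletion variants at a given fraction of residues could be generated before re-running a structure predictor. It is not the authors' pipeline: the function names, the example sequence, and the uniform random choice of positions and substitutions are all assumptions made for illustration, and the abstract's "deliberately destabilizing" substitutions would need a targeted substitution rule that is not described here.

```python
import random

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def point_mutate(seq, fraction, rng):
    """Substitute a given fraction of residues with a different amino acid.

    Positions and replacements are chosen uniformly at random, which is an
    assumption of this sketch rather than the procedure used in the preprint.
    """
    n_mutations = round(len(seq) * fraction)
    positions = rng.sample(range(len(seq)), n_mutations)
    residues = list(seq)
    for i in positions:
        # Exclude the original residue so every chosen position really changes.
        choices = [aa for aa in AMINO_ACIDS if aa != seq[i]]
        residues[i] = rng.choice(choices)
    return "".join(residues)


def delete_residues(seq, fraction, rng):
    """Remove a given fraction of residues at random positions."""
    n_deletions = round(len(seq) * fraction)
    dropped = set(rng.sample(range(len(seq)), n_deletions))
    return "".join(aa for i, aa in enumerate(seq) if i not in dropped)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    # Hypothetical 50-residue sequence, used only as an example input.
    wild_type = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSG"
    print(point_mutate(wild_type, 0.40, rng))     # 40% of residues substituted
    print(delete_residues(wild_type, 0.10, rng))  # 10% of residues deleted
```

Each variant would then be passed to the predictor and its predicted structure compared with the prediction for the unmodified sequence; a model that is sensitive to sequence should show decreasing structural similarity as the mutated fraction grows.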
Article activity feed
-
Under deletion mutations, ESMFold and AlphaFold 3 perform comparably, with ESMFold exhibiting consistently greater flexibility (that is, lower structural similarity to its original prediction) across all deletion thresholds. This suggests that ESMFold’s predictions are slightly more sensitive to the disruptive effects of residue removal. Whether they yield experimentally correct structures is still an open question.
This is exciting to see here - it mirrors our experience, where AlphaFold infers putatively erroneous dimeric structures but ESMFold will not create a contiguous protein from the same input sequences.
-
This result reinforces the conclusion that AlphaFold 3 frequently fails to capture biologically plausible mutational responses, even when multiple conformations are known to exist. The AlphaFold 3 ranking score similarly remains high up to the 40% threshold, further underscoring confidence metric invariance. Hence, even for the small proteins where fold-switching occurs, AlphaFold 3 does not reliably respond to known fold-switching mutations, suggesting it should be used with caution in protein sequence optimization.
Thanks for this interesting paper - I wondered whether, for these fold-switching proteins, both conformations are present in the AlphaFold training set or only one. I'd be especially curious about how much perturbation is required to force fold switching when both conformations are present.