Mechanistic evidence that motif-gated domain recognition drives contact prediction in protein language models

Abstract

Protein language models (pLMs) achieve state-of-the-art performance on protein structure and function prediction tasks, yet their internal computations remain opaque. Sparse autoencoders (SAEs) have been used to recover sparse features, called latents, from pLM layer representations; the activations of these latents correlate with known biological concepts. However, prior work has not established which model concepts are causally necessary for pLM performance on downstream tasks. Here, we adapt causal activation patching to the pLM setting and perform it in SAE latent space to extract the minimal circuit responsible for accuracy in a contact prediction task for two case study proteins. We observe that preserving only a tiny fraction of latent–token pairs (0.022% and 0.015%) is sufficient to retain contact prediction accuracy in a residue unmasking experiment. Our circuit indicates a two-step computation in which early-layer motif detectors respond to short local sequence patterns and gate mid-to-late-layer domain detectors that are selective for protein domains and families. Path-level ablations confirm the causal dependence of domain latents on upstream motif latents. To evaluate these components quantitatively, we introduce two diagnostics: a Motif Conservation Test and a Domain Selectivity Framework that supports hypothesis-driven tests. All candidate motif-detector latents pass the conservation test, and 18/23 candidate domain-detector latents achieve AUROC ≥0.95. To our knowledge, this is the first circuits-style causal analysis for pLMs, pinpointing the motifs, domains, and motif-domain interactions that drive contact prediction in two specific case studies. The framework introduced herein will enable future mechanistic dissection of protein language models. Code is available at https://github.com/NainaniJatinZ/plm_circuits
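
As a rough illustration of what an AUROC ≥0.95 threshold for a domain-detector latent means, the sketch below scores one latent as a binary classifier for domain membership. It is not the authors' Domain Selectivity Framework: the function name `auroc`, the toy activation values, and the choice of one pooled activation per protein (for example, the maximum over residues) are all assumptions made for illustration.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U statistic); tied scores share an average rank.

    scores: one pooled latent activation per protein (hypothetical pooling, e.g. max over residues)
    labels: 1 if the protein contains the domain of interest, else 0
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)

    # Assign 1-based ranks, averaging within each run of tied scores.
    i = 0
    while i < scores.size:
        j = i
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1

    # U statistic for the positive class, normalised to [0, 1].
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Toy data (invented): pooled activations of one latent on six proteins.
pooled_activation = [2.1, 1.7, 0.0, 0.3, 1.9, 0.1]
has_domain = [1, 1, 0, 0, 1, 0]

score = auroc(pooled_activation, has_domain)
passes_threshold = score >= 0.95  # threshold quoted in the abstract
```

An AUROC of 1.0 means every protein with the domain receives a higher activation than every protein without it, while 0.5 corresponds to chance-level ranking.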
