Hidden State Genomics: Graph-Based Analysis of Sparse Auto-Encoder Feature Activity in Genomic Language Models
Abstract
Pre-trained genomic language model (gLM) representations were expected to improve deep learning predictions on several genomics tasks, but current benchmarking has raised questions about what they actually encode. We studied this with mechanistic interpretability on InstaDeep’s Nucleotide Transformer v2 (500M), training sparse autoencoders (SAEs) across all 24 encoder layers to probe latent features. Correlation-based annotation against reference regulatory tracks was inconsistent across layers and insufficient for causal interpretation. We therefore built typed sequence-to-feature knowledge graphs to explore the SAE feature space and compared communities of cisplatin-binding and non-binding genomic DNA sequences by PageRank centrality, validating candidate features with decoder-based interventions and a CNN binding classifier. Interventions showed asymmetric effects: suppressive features could collapse the predictive signal, while binding-associated features shifted predictions cumulatively, in combination with other binding-associated signals. Dependency maps further indicated strong local feature sensitivity within sequences. Together, these results provide evidence that gLM representations encode highly granular sequence syntax and conservation patterns, aligning more strongly with tightly coupled molecular interactions and local biophysical constraints than with complex, distributed regulatory logic. Within the scope of our intervention setting, this pattern is consistent with stronger performance on selected molecular tasks and weaker performance on broader regulatory inference, motivating scalable methods for causal feature annotation.
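To make the graph-based comparison concrete, the following is a minimal sketch of ranking SAE features by PageRank centrality on a sequence-to-feature graph, computed separately for a binding and a non-binding sequence set. It is an illustration under stated assumptions, not the authors' pipeline: the function names (`pagerank`, `feature_centrality`), the untyped bipartite graph, the activation threshold, and the toy activation values are all hypothetical, and the typed knowledge graph described above may contain additional node and edge types.

```python
import numpy as np

def pagerank(adj, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank; adj[i, j] is the weight of edge i -> j."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    # Row-normalise edge weights; nodes without out-edges jump uniformly.
    trans = np.where(out > 0, adj / np.where(out > 0, out, 1.0), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = (1 - damping) / n + damping * (rank @ trans)
        if np.abs(new - rank).sum() < tol:
            return new
        rank = new
    return rank

def feature_centrality(activations, threshold=0.0):
    """Map (n_sequences, n_features) SAE activations to one PageRank score per feature.

    Assumption: sequences and features are the only node types, and an edge
    exists wherever a feature's activation on a sequence exceeds `threshold`.
    """
    n_seq, n_feat = activations.shape
    weights = np.where(activations > threshold, activations, 0.0)
    n = n_seq + n_feat
    adj = np.zeros((n, n))
    adj[:n_seq, n_seq:] = weights      # sequence -> feature edges
    adj[n_seq:, :n_seq] = weights.T    # feature -> sequence edges
    return pagerank(adj)[n_seq:]       # keep only the feature nodes

# Toy activations (invented for illustration, not results from the paper).
binding = np.array([[0.9, 0.1, 0.0],
                    [0.8, 0.0, 0.1],
                    [0.7, 0.2, 0.0]])
non_binding = np.array([[0.0, 0.1, 0.9],
                        [0.1, 0.0, 0.8],
                        [0.0, 0.2, 0.7]])

# Positive values mark features more central in the binding community.
diff = feature_centrality(binding) - feature_centrality(non_binding)
candidates = np.argsort(diff)[::-1]   # features ordered by binding-associated centrality
```

In a sketch like this, features at the top of `candidates` would be the ones to test with interventions, since a centrality difference alone is correlational and does not establish a causal role.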