Interpreting Attention Mechanisms in Genomic Transformer Models: A Framework for Biological Insights
Abstract
Transformer models have shown strong performance on biological sequence prediction tasks, but the interpretability of their internal mechanisms remains underexplored. Given their application in biomedical research, understanding the mechanisms behind these models’ predictions is crucial for their widespread adoption. We introduce a method to interpret attention heads in genomic transformers by correlating per-token attention scores with curated biological annotations, and we use GPT-4 to summarize each head’s focus. Applying this method to DNABERT, Nucleotide Transformer, and scGPT, we find that attention heads learn biologically meaningful associations during self-supervised pre-training and that these associations shift with fine-tuning. We show that interpretability varies with tokenization scheme and that context-dependence plays a key role in head behaviour. Through ablation experiments, we demonstrate that heads strongly associated with biological features are more important for task performance than uninformative heads in the same layers. In DNABERT trained for TATA promoter prediction, we observe heads with positive and negative annotation associations, reflecting positive and negative learning dynamics respectively. Our results offer a framework for tracing how biological features are learned, from random initialization through pre-training to fine-tuning, enabling insight into how genomic foundation models represent nucleotides, genes, and cells.
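To make the core correlation step concrete, the following is a minimal sketch, not the paper’s actual code. It assumes an attention tensor of shape (layers, heads, seq_len, seq_len), as returned by typical transformer libraries, and a hypothetical binary per-token annotation vector marking a biological feature; the reduction to per-token scores (mean attention received across query positions) and the choice of Pearson correlation are illustrative assumptions.

```python
# Minimal sketch of correlating per-head attention with a biological
# annotation. Shapes, names, and the correlation statistic are assumptions,
# not the authors' exact implementation.
import numpy as np
from scipy.stats import pearsonr


def head_annotation_correlations(attentions: np.ndarray,
                                 annotation: np.ndarray) -> np.ndarray:
    """Correlate each head's per-token attention with a feature annotation.

    attentions: (n_layers, n_heads, seq_len, seq_len) attention weights.
    annotation: (seq_len,) binary vector, e.g. tokens inside a TATA box
                (hypothetical annotation for illustration).
    Returns an (n_layers, n_heads) matrix of Pearson correlations.
    """
    n_layers, n_heads, _, _ = attentions.shape
    corrs = np.zeros((n_layers, n_heads))
    for layer in range(n_layers):
        for head in range(n_heads):
            # One common reduction: average over query positions to get
            # the attention each key token receives.
            per_token = attentions[layer, head].mean(axis=0)
            corrs[layer, head], _ = pearsonr(per_token, annotation)
    return corrs


# Toy usage: 2 layers, 4 heads, 16 tokens; rows normalized to sum to 1
# like softmax attention. The annotation here is random, purely for demo.
rng = np.random.default_rng(0)
att = rng.random((2, 4, 16, 16))
att /= att.sum(axis=-1, keepdims=True)
feature = (rng.random(16) > 0.7).astype(float)
print(head_annotation_correlations(att, feature))
```

Heads whose correlation is strongly positive (or negative) for a given annotation would then be the candidates summarized by GPT-4 and targeted in the ablation analysis described above.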