Sparse Autoencoders Reveal Interpretable Features in Single-Cell Foundation Models
Abstract
Single-cell foundation models (scFMs) hold promise for applications such as cell type annotation and data integration, but their internal mechanisms remain poorly understood. We investigate the structure of these models by training sparse autoencoders (SAEs) on the hidden representations of two widely used scFMs, scGPT and scFoundation. The learned features reveal diverse and complex biological and technical signals, which emerge even in pre-trained models. We also observe that the encoding of this information differs between scFMs with distinct training protocols and architectures. Further, we find that while many features capture cell-type information across several studies, they often fall short of unifying it into a single generalized representation. Finally, we demonstrate that SAE-derived features are causally related to model behavior and can be intervened upon to reduce unwanted technical effects while steering model outputs to preserve the core biological signal. These findings provide a path toward more interpretable and controllable single-cell foundation models.
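To make the method concrete, the core idea of training an SAE on a model's hidden representations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dimensions, expansion factor, and sparsity coefficient are assumed, and the random tensor stands in for actual scFM hidden states.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: an overcomplete linear dictionary trained with an
    L1 penalty so that each input activates only a few features."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        # ReLU keeps feature activations non-negative; with the L1
        # penalty below, most activations are driven toward zero.
        f = torch.relu(self.encoder(x))
        x_hat = self.decoder(f)
        return x_hat, f

# Hypothetical setup: d_model mimics an scFM hidden size, with an
# 8x overcomplete feature dictionary (both values are assumptions).
d_model, d_hidden = 512, 4096
sae = SparseAutoencoder(d_model, d_hidden)

x = torch.randn(64, d_model)  # stand-in for a batch of hidden states
x_hat, f = sae(x)

l1_coeff = 1e-3  # sparsity strength (assumed, typically tuned)
loss = ((x_hat - x) ** 2).mean() + l1_coeff * f.abs().mean()
loss.backward()  # one optimization step would follow in a real loop
```

In this framing, each column of the decoder is a candidate "feature" direction in the scFM's representation space; interpretation then proceeds by examining which cells or genes most strongly activate each feature, and interventions amount to editing `f` before decoding.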