Inflammation-Linked Aging Signals in Frozen Single-Cell Foundation Models: Donor-Aware Detection and Robustness Testing

Abstract

Single-cell foundation models offer a possible route to studying aging biology in latent space, but apparent age effects can be distorted by donor identity and cell-composition differences. We developed a donor-aware interpretability workflow to test whether frozen scGPT and Geneformer representations contain biologically coherent aging signals in age-labeled human single-cell datasets. The workflow combined donor-held-out age decoding, sparse autoencoder feature discovery, cross-model pathway matching, targeted latent-space interventions, donor-bootstrap confidence intervals, and progressively stricter confound controls. Across five datasets, frozen representations contained detectable age information, with a best donor-aware balanced accuracy of 0.384. Sparse autoencoders identified 132 donor-aware robust features, and cross-model pathway matching produced 193 paired features, with the strongest convergence in inflammation and NF-kappaB-related programs. The clearest intervention signal was observed in a global Geneformer inflammatory branch, where old-versus-random and old-versus-young contrasts remained positive after split expansion and donor-threshold tightening up to 400 donors. In a monocyte-restricted analysis, both scGPT and Geneformer also showed positive old-versus-random responses in one cohort. However, the strongest global signal weakened under stricter controls, and fully composition-matched forward-pass reruns yielded 0 of 4 full strict replications. These results indicate that frozen single-cell foundation models capture biologically plausible aging-related structure, especially around inflammatory programs, but that donor-aware and composition-aware stress tests are necessary before such signals can be interpreted as robust mechanisms.
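As an illustration of what a donor-bootstrap confidence interval for balanced accuracy can look like, the following is a minimal sketch and not the authors' implementation. It assumes per-cell age-class labels, per-cell predictions from some donor-held-out decoder, and a per-cell donor identifier; the function names `balanced_accuracy` and `donor_bootstrap_ci`, the percentile method, and the defaults are all hypothetical choices made for this example. The key point is that whole donors, not individual cells, are resampled, so that cells from the same donor are never treated as independent observations.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))


def donor_bootstrap_ci(y_true, y_pred, donors, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI from resampling whole donors with replacement.

    Illustrative helper (an assumption of this sketch): returns
    (point_estimate, lower, upper).
    """
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    donors = np.asarray(donors)

    unique_donors = np.unique(donors)
    # Pre-compute the cell indices belonging to each donor.
    idx_by_donor = {d: np.flatnonzero(donors == d) for d in unique_donors}

    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Draw donors with replacement; a donor drawn twice contributes
        # all of its cells twice.
        sampled = rng.choice(unique_donors, size=unique_donors.size, replace=True)
        idx = np.concatenate([idx_by_donor[d] for d in sampled])
        stats[b] = balanced_accuracy(y_true[idx], y_pred[idx])

    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return balanced_accuracy(y_true, y_pred), float(lower), float(upper)
```

Because the resampling unit is the donor, the interval width reflects the number of donors rather than the number of cells, which is typically far larger in single-cell data.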
