Evaluating the learnability of single-cell large language models on multiple tasks

Abstract

The rise of single-cell foundation models (scFMs) has sparked interest in their potential to unify diverse biological tasks. However, their practical utility and the validity of scaling laws—the assumption that performance improves with model and data size—remain under-examined. Here, we systematically evaluate two representative scFMs, Geneformer and scGPT, on perturbation prediction and cell type annotation tasks. Our findings suggest that the benefits of large-scale pretraining are strongly task-dependent, conferring substantial advantages in cell type annotation but limited gains in perturbation prediction. Furthermore, our results indicate that increasing model size does not guarantee improved performance and can even be detrimental, challenging the "bigger is better" paradigm. By comparing model performance on real versus synthetic data of varying complexity, our analysis suggests that for perturbation prediction, the tested scFMs capture little more than simple summary statistics and may struggle to learn complex biological interactions. These results highlight the need to move beyond scaling and toward developing models that integrate deeper biological knowledge. We suggest that a renewed focus on task-specific architectures and biologically informed priors may be critical for unlocking the true potential of foundation models in single-cell biology.