scPlantFormer: A Lightweight Foundation Model for Plant Single-Cell Omics Analysis
Abstract
Foundation models have revolutionized single-cell omics data analysis, and the increasing adoption of single-cell technologies in plant biology highlights the pressing need for efficient analytical tools. Developing a high-performance, lightweight foundation model for plant science is complex yet necessary. Inspired by the observation that the gene expression vector of a cell is less information-dense than a sentence, we offer a new perspective on pretraining single-cell omics foundation models and develop scPlantFormer, a model pretrained on one million Arabidopsis thaliana scRNA-seq data points. Systematic benchmarking reveals that scPlantFormer excels in plant scRNA-seq analysis. In addition, we propose two workflows that refine cell-type identification and significantly enhance the accuracy of inter-dataset cell-type annotation. scPlantFormer effectively integrates scRNA-seq data across species, identifying conserved cell types validated by the literature and uncovering novel ones. It also constructs a comprehensive Arabidopsis thaliana atlas of approximately 400,000 cells, positioning scPlantFormer as a powerful tool for plant single-cell omics.