scPlantFormer: A Lightweight Foundation Model for Plant Single-Cell Omics Analysis

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Foundation models have revolutionized single-cell omics data analysis and the increasing adoption of single-cell technologies in plant biology highlights the pressing need for efficient analytical tools. Developing a high-performance and lightweight foundation model for plant science is complex yet necessary. Inspired by the fact that the gene expression vector of cells contain less information-dense than the sentence, we offer a new perspective on pretraining single-cell omics foundation models and develop scPlantFormer, a model pretrained on one million Arabidopsis thaliana scRNA-seq data. Systematic benchmarking reveals that scPlantFormer excels in plant scRNA-seq analysis. Besides, two workflows are proposed to refine cell-type identification and significantly enhance the accuracy of inter-dataset cell-type annotation. scPlantFormer effectively integrates scRNA-seq data across species, identifying conserved cell types validated by the literature and uncovering novel ones. Additionally, it constructs a comprehensive Arabidopsis thaliana atlas with approximately 400,000 cells, positioning scPlantFormer as a powerful tool for plant single-cell omics.

Article activity feed