scFPC-DE: Robust Differential Expression Analysis Along Single Cell Trajectories via Functional Principal Component Analysis

Abstract

Motivation

Identifying temporally differentially expressed genes (TDEGs) along pseudotime trajectories from single cell RNA sequencing (scRNA-seq) data helps characterize the cellular states that underlie the dynamic process of cellular development. However, existing tests based on generalized additive models (GAMs) suffer from increased false positive rates under zero inflation caused by high dropouts, a ubiquitous technical artifact of scRNA-seq data. Furthermore, by testing each gene independently, existing tests ignore the variance-covariance structure shared across genes along the trajectory, leading to suboptimal power and reduced interpretability.

Results

We present scFPC-DE, a trajectory-based differential expression analysis (TDEA) method based on functional data analysis (FDA). It models the gene expression as a function of pseudotime in the $L^2$ space and represents the covariance structure of these functions by eigenfunctions derived from functional principal component (FPC) analysis. This approach effectively captures informative gene expression patterns along the trajectory, while mitigating the influence of zero inflation in both simulation and real data analysis. In simulations, scFPC-DE exhibited superior control of type I error and achieved the highest ROC-AUC among competing methods. When applied to an scRNA-seq dataset of B cell subtypes, scFPC-DE uniquely identified TDEGs enriched for B cell differentiation pathways, outperforming existing methods in biological relevance. These results show that scFPC-DE effectively captures the shared gene expression variation and pseudo-temporal structure along the single cell trajectory for TDEG identification.
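
The core idea of representing per-gene expression curves by eigenfunctions of their shared covariance can be illustrated with a minimal, generic sketch. It is not the scFPC-DE implementation or its R API. It assumes pseudotime is scaled to [0, 1], replaces proper smoothing with simple bin averages, and extracts only the leading FPC by power iteration. The names `bin_means` and `leading_fpc` are hypothetical and chosen for illustration only.

```python
import math


def bin_means(pseudotime, expression, n_bins):
    """Average one gene's expression within equal-width pseudotime bins on [0, 1]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, y in zip(pseudotime, expression):
        b = min(int(t * n_bins), n_bins - 1)  # t == 1.0 falls in the last bin
        sums[b] += y
        counts[b] += 1
    # Assumption: empty bins are set to 0.0; a real analysis would smooth instead.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def leading_fpc(curves, n_iter=500):
    """Leading functional principal component of curves sampled on a common grid.

    curves: list of equal-length lists, one discretized curve per gene.
    Returns (mean_curve, eigenfunction, eigenvalue, scores), with the
    eigenfunction scaled so that its discretized L2 norm on [0, 1] is 1.
    """
    n_curves, n_grid = len(curves), len(curves[0])
    h = 1.0 / n_grid  # grid spacing, used as the quadrature weight

    # Center the curves around the pointwise mean curve.
    mean = [sum(c[j] for c in curves) / n_curves for j in range(n_grid)]
    centered = [[c[j] - mean[j] for j in range(n_grid)] for c in curves]

    # Sample covariance between grid points (n_grid x n_grid matrix).
    cov = [[sum(r[j] * r[k] for r in centered) / (n_curves - 1)
            for k in range(n_grid)] for j in range(n_grid)]

    # Power iteration for the dominant eigenvector of the covariance matrix.
    v = [1.0 + j / n_grid for j in range(n_grid)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[j][k] * v[k] for k in range(n_grid)) for j in range(n_grid)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all curves identical: no variation to decompose
            break
        v = [x / norm for x in w]

    # Rayleigh quotient gives the matrix eigenvalue; rescale to the L2 setting.
    cov_v = [sum(cov[j][k] * v[k] for k in range(n_grid)) for j in range(n_grid)]
    eigenvalue = sum(v[j] * cov_v[j] for j in range(n_grid)) * h
    eigenfunction = [x / math.sqrt(h) for x in v]

    # FPC score of each gene: discretized inner product with the eigenfunction.
    scores = [sum(r[j] * eigenfunction[j] for j in range(n_grid)) * h
              for r in centered]
    return mean, eigenfunction, eigenvalue, scores
```

In this sketch, each gene would first be reduced to a curve with `bin_means`, and the resulting list of curves passed to `leading_fpc`. Genes with large absolute scores vary strongly along the dominant shared pattern, which is the kind of low-dimensional summary a downstream test could work with. How scFPC-DE smooths the data, selects the number of FPCs, and constructs its test statistic is described in the full article and is not reproduced here.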

Availability

R package and code vignettes are publicly available at https://github.com/LopezRicardo1/scFPCDE

Contact

xing_qiu@urmc.rochester.edu; yun.zhang@nih.gov

Supplementary information

Supplementary data are available at Bioinformatics online.
