CycleVI: Isolating cell cycle variation with an interpretable deep generative model
Abstract
Cell cycle progression is a dominant source of variation in single-cell RNA-sequencing (scRNA-seq) data, often obscuring informative signals of cell identity and state. Current computational methods to address this problem either discard biologically relevant information through regression or require unspliced transcript data. This limits their applicability to most existing datasets. Here, we present CycleVI, a deep generative model that disentangles cell cycle variation from all other transcriptional signals in static scRNA-seq data by learning a partitioned latent representation with a dedicated circular subspace. CycleVI infers a continuous cell cycle phase, validated against orthogonal protein-level measurements, and produces a residual latent space free from cell cycle artifacts. We demonstrate that this disentangled representation unmasks meaningful biological heterogeneity in cancer cells and clarifies complex differentiation trajectories in hematopoietic progenitors. Furthermore, by applying CycleVI to spatially resolved transcriptomics data from a human breast cancer biopsy, we map proliferative activity in situ, revealing a clear demarcation between cycling tumor regions and surrounding quiescent tissue. CycleVI provides a principled approach to isolating, rather than removing, confounding cell cycle-related variation from widely available scRNA-seq data, enabling more robust analyses of cellular heterogeneity.
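To make the idea of a partitioned latent representation with a circular subspace concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the first two latent dimensions form the cell cycle block and that the remaining dimensions are the residual block. The function name `split_latent` and the toy latent vectors are hypothetical.

```python
import numpy as np

def split_latent(z):
    """Split latent vectors into a circular block and a residual block.

    Assumption (illustrative only): columns 0-1 hold the cell cycle block,
    all remaining columns hold the residual, non-cycle block.
    """
    z = np.asarray(z, dtype=float)
    z_cycle, z_rest = z[:, :2], z[:, 2:]
    # Project the 2-D block onto the unit circle so only the angle carries signal.
    norm = np.linalg.norm(z_cycle, axis=1, keepdims=True)
    unit = z_cycle / np.clip(norm, 1e-12, None)
    # Continuous phase in [0, 2*pi), read off as the angle on the circle.
    phase = np.mod(np.arctan2(unit[:, 1], unit[:, 0]), 2 * np.pi)
    return phase, unit, z_rest

# Toy latent vectors for three cells (made-up numbers, not real data).
z = np.array([[1.0, 0.0, 0.3, -1.2],
              [0.0, 2.0, 0.3, -1.2],
              [-3.0, 0.0, 0.5, 0.1]])
phase, unit, z_rest = split_latent(z)
```

In this sketch the first two cells share identical residual coordinates and differ only in phase, which is the kind of separation a disentangled representation aims for: downstream clustering on `z_rest` would treat them as the same state regardless of where they sit in the cycle.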