scRegulate: Single-Cell Regulatory-Embedded Variational Inference of Transcription Factor Activity from Gene Expression
Abstract
Motivation
Accurately inferring transcription factor (TF) activity from single-cell RNA sequencing (scRNA-seq) data remains a fundamental challenge in computational biology. While existing methods rely on statistical models, motif enrichment, or prior-based inference, they often suffer from deterministic assumptions about regulatory relationships, reliance on static regulatory databases, or lack of interpretability. Moreover, few approaches can effectively integrate prior biological knowledge with data-driven inference to capture novel, dynamic, and context-specific regulatory interactions.
Results
To address these limitations, we develop scRegulate, a generative deep learning framework that leverages variational inference to infer TF activities while incorporating gene regulatory network (GRN) priors. By integrating structured biological constraints with a probabilistic latent space model, scRegulate offers a scalable and biologically interpretable solution for predicting regulatory interactions from scRNA-seq data. We comprehensively benchmark scRegulate on multiple public experimental datasets and on synthetic datasets generated with GRouNdGAN, demonstrating its ability to infer TF activities and GRNs that are consistent with the underlying ground-truth regulatory interactions. scRegulate outperforms existing TF inference methods, achieving AUROC values of 0.71-0.86 and AUPRC values of 0.80-0.95 on three synthetic datasets. Additionally, scRegulate accurately recapitulates experimentally validated TF knockdown effects on a Perturb-seq dataset, achieving mean log2 fold changes ranging from -0.66 to -16.98 (p ≤ 8.06 × 10⁻¹³) for key TFs such as ELK1, EGR1, and CREB1. Applied to PBMC scRNA-seq data, scRegulate reconstructs cell-type-specific GRNs and identifies differentially active TFs that align with known immune regulatory pathways. Furthermore, we show that scRegulate's TF embeddings capture meaningful transcriptional heterogeneity, enabling accurate clustering of cell types. Collectively, our results establish scRegulate as a powerful, interpretable, and scalable framework for inferring TF activities and regulatory networks from single-cell transcriptomics.
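For readers unfamiliar with the two benchmark metrics reported above, the following is a minimal sketch of how inferred TF-gene edge scores could be compared against a binary ground-truth network. It is not taken from the scRegulate code base: the function names `auroc` and `average_precision`, and the toy scores and labels, are hypothetical. AUPRC is approximated here by average precision, one common estimator.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a true edge outscores a false edge.

    Computed from the Mann-Whitney U statistic; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true edge and one false edge")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision, a step-wise estimate of AUPRC.

    Assumes no tied scores; ties would be broken by input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one true edge")
    # Rank candidate edges from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: one score per candidate TF-gene edge,
# and 1/0 labels marking whether the edge is in the ground-truth GRN.
edge_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
edge_labels = [1, 0, 1, 1, 0, 0]

toy_auroc = auroc(edge_scores, edge_labels)
toy_auprc = average_precision(edge_scores, edge_labels)
print(f"AUROC={toy_auroc:.3f}  AUPRC~{toy_auprc:.3f}")
```

The pairwise AUROC loop is quadratic in the number of edges and is meant only for illustration; for networks of realistic size, a rank-based implementation such as the one in scikit-learn would be the practical choice.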
Availability
All datasets and results are available on GitHub at github.com/YDaiLab/scRegulate.
Contact
yangdai@uic.edu
Supplementary information
Supplementary data are available at Bioinformatics online.