scCoBench: Benchmarking single cell RNA-seq co-expression using promoter-reporter lines
Abstract
Single-cell RNA sequencing (scRNA-seq) has become a powerful tool for uncovering transcriptomic heterogeneity and reconstructing gene regulatory networks in complex tissues. However, the sparsity, high noise levels, and dropout events inherent to scRNA-seq data make it difficult to infer gene-gene relationships accurately. In this study, scCoBench, we systematically benchmark correlation metrics, pseudobulk analysis, and imputation methods, using promoter-reporter and native gene pairs as internal controls to evaluate the performance of ten widely used gene-gene co-expression measurements. Interestingly, we found that commonly used data scaling and normalization approaches lower the correlation between promoter-reporter and native gene pairs for most of the co-expression methods. Moreover, we assess the impact of five popular imputation techniques, namely scImpute, SAVER, Autoencoder (AE), Variational Autoencoder (VAE), and Generative Adversarial Network (GAN), on recovering biologically relevant co-expression patterns. Our results demonstrate that imputation models not only markedly enhance the correlation between each promoter-reporter and native gene pair but also increase the number of cells co-expressing both genes. Imputation also improved transcription factor-target gene correlations and revealed stronger associations among genes within the same protein complex. This work highlights the utility of promoter-reporter systems for benchmarking computational methods and underscores the potential of deep learning-based imputation to strengthen biologically relevant signals in scRNA-seq data.
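To make the benchmarking idea concrete, the following is a minimal sketch of how a single promoter-reporter and native gene pair could be scored. It is not the scCoBench pipeline: the data are simulated, the assumption that the reporter perfectly tracks the native gene before dropout is an idealization, and all function names (`pearson`, `spearman`, `coexpressing_cells`, `simulate_pair`) are hypothetical. It only illustrates two quantities mentioned in the abstract, the correlation between the pair and the number of cells co-expressing both genes, and how random dropout can degrade them.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def coexpressing_cells(x, y):
    """Number of cells in which both genes have non-zero expression."""
    return sum(1 for a, b in zip(x, y) if a > 0 and b > 0)


def simulate_pair(n_cells=500, dropout=0.6, seed=0):
    """Simulate a native gene and its reporter with independent dropout.

    Assumption (illustrative only): the reporter equals the native gene's
    true expression, and each observed value is zeroed with probability
    `dropout`, independently for the two genes.
    """
    rng = random.Random(seed)
    true_expr = [rng.gammavariate(2.0, 2.0) for _ in range(n_cells)]
    native_obs = [0.0 if rng.random() < dropout else v for v in true_expr]
    reporter_obs = [0.0 if rng.random() < dropout else v for v in true_expr]
    return true_expr, native_obs, reporter_obs


if __name__ == "__main__":
    true_expr, native_obs, reporter_obs = simulate_pair()
    print("Pearson (observed): ", pearson(native_obs, reporter_obs))
    print("Spearman (observed):", spearman(native_obs, reporter_obs))
    print("Co-expressing cells:", coexpressing_cells(native_obs, reporter_obs))
```

In a real benchmark, the same scores would be computed on measured counts before and after each normalization or imputation step, so that a method can be judged by whether it moves the reporter-native pair closer to the agreement expected from a shared promoter.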