scE2EGAE: Enhancing Single-cell RNA-Seq Data Analysis through an End-to-End Cell-Graph-Learnable Graph Autoencoder with Differentiable Edge Sampling

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Background: Single-cell RNA sequencing (scRNA-Seq) technology reveals biological processes and molecular-level genomic information among individual cells. Numerous computational methods, including methods based on graph neural networks (GNNs), have been developed to enhance scRNA-Seq data analysis. However, existing GNNs-based methods usually construct fixed graphs by applying the k-nearest neighbors (KNN) algorithm, which may result in information loss. Methods: To address this problem, we propose scE2EGAE, which learns cell graphs during the training processes. Firstly, the scRNA-Seq data is fed into a deep count autoencoder (DCA). Secondly, the hidden representations of DCA are extracted and then used to generate cell-to-cell graph edges through a straight-through estimator based on top-k sampling and Gumbel-softmax. Finally, the generated cell-to-cell graph and scRNA-Seq data are fed into the GNNs-based downstream tasks. In this paper, we design a graph autoencoder which performs denoising on scRNA-Seq data as the downstream task. Results: We evaluate scE2EGAE on eight public scRNA-Seq datasets and compare its performance with seven existing scRNA-Seq data denoising methods. In this paper, extensive experiments are conducted, encompassing: 1) the evaluation of denoising performance, with metrics including mean absolute error (MAE), Pearson correlation coefficient (PCC), and cosine similarity (CS); 2) the assessment of clustering performance of the denoised results, utilizing adjusted rand index (ARI) and normalized mutual information (NMI); and 3) the evaluation of the cell trajectory inference performance of the denoised results, measured by the pseudo-temporal ordering score (POS). The results show that scE2EGAE outperforms most of the methods, proving that it can learn cell-to-cell graphs containing real information of cell-to-cell relationships. Conclusions: In this paper, we validate the proposed scE2EGAE method through its application to the denoising task of scRNA-Seq data. This method demonstrates its capability to learn inter-cellular relationships and construct cell-to-cell graphs, thereby enhancing downstream analysis of scRNA-Seq data. Our approach can serve as an inspiration for future research on scRNA-Seq analysis methods based on GNNs, holding broad application prospects.

Article activity feed