‘Trans-differentiation of neutrophils from plasmablast’ is an artefact caused by over-reliance on machine algorithms in single cell RNA sequencing analysis: Lesson learnt and steps ahead
Abstract
Single-cell RNA sequencing (scRNA-seq) provides new opportunities to characterize gene expression in individual cells. However, the sparse nature of scRNA-seq data, with many zero counts or missing values, presents a major challenge to its analysis. The presence of low-quality cells further complicates the analysis. Here, we show that the trans-differentiation of plasmablasts (activated plasma cells) into neutrophils reported in COVID-19 patients (Wilk et al., 2020, Nature Medicine) was an artefact of trajectory analysis. It was caused by approximately 30 low-quality cells linking the two cell populations, which belong to unrelated lineages in hematopoietic differentiation.
Such artefacts are not readily spotted in current peer-review practice because the statistical guidelines of most journals do not cater for big data such as scRNA-seq. New statistical standards and quality control measures for machine algorithms are not yet in place, and they are urgently needed to safeguard against over-interpretation of high-dimensional data. We propose a comprehensive framework to ensure reproducibility in high-dimensional data analysis, emphasizing quality checks, sensitivity analyses, and validation with multiple alternative algorithms. Finally, and most importantly, a hypothesis-driven research approach should be upheld.
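To make the idea of a sensitivity analysis concrete, the following is a minimal sketch on synthetic data, not the pipeline used in this study or in Wilk et al. It assumes a two-dimensional embedding with two well-separated populations and about 30 low-quality cells lying between them, builds an undirected k-nearest-neighbour graph, and asks whether the two populations are linked before and after a quality filter. The function names, the coordinates, the `n_genes` metric and the cut-off of 200 detected genes are all illustrative assumptions.

```python
import numpy as np
from collections import deque


def knn_indices(X, k):
    """Indices of the k nearest neighbours (Euclidean) of each row of X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a cell is not its own neighbour
    return np.argsort(d, axis=1)[:, :k]


def populations_linked(X, labels, a, b, k=10):
    """True if any cell labelled `a` can reach a cell labelled `b`
    through the undirected kNN graph."""
    n = len(X)
    adj = [set() for _ in range(n)]
    for i, row in enumerate(knn_indices(X, k)):
        for j in row:
            adj[i].add(int(j))
            adj[int(j)].add(i)
    # Breadth-first search starting from every cell of population `a`.
    seen = np.zeros(n, dtype=bool)
    start = np.flatnonzero(labels == a)
    seen[start] = True
    queue = deque(start.tolist())
    while queue:
        i = queue.popleft()
        for j in adj[i]:
            if not seen[j]:
                seen[j] = True
                queue.append(j)
    return bool(seen[labels == b].any())


# Synthetic embedding (assumed values, for illustration only).
rng = np.random.default_rng(0)
plasmablasts = rng.normal([0.0, 0.0], 0.5, size=(200, 2))
neutrophils = rng.normal([10.0, 0.0], 0.5, size=(200, 2))
bridge = np.column_stack([np.linspace(1.5, 8.5, 30), rng.normal(0.0, 0.05, 30)])

X = np.vstack([plasmablasts, neutrophils, bridge])
labels = np.array(["plasmablast"] * 200 + ["neutrophil"] * 200 + ["bridge"] * 30)

# Hypothetical quality metric: number of detected genes per cell.
n_genes = np.concatenate([rng.integers(1000, 3000, 400), rng.integers(50, 200, 30)])

linked_all = populations_linked(X, labels, "plasmablast", "neutrophil")

keep = n_genes >= 200  # illustrative cut-off, not a recommended value
linked_filtered = populations_linked(X[keep], labels[keep], "plasmablast", "neutrophil")

print("linked with all cells:      ", linked_all)
print("linked after quality filter:", linked_filtered)
```

In this toy setting the link between the two populations exists only while the low-quality cells are present. On real data the same comparison would be repeated across several quality thresholds and more than one trajectory algorithm, in line with the framework proposed above.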