ImmCellTyper: an integrated computational pipeline for systematic mining of Mass Cytometry data to assist deep immune profiling

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This valuable manuscript presents ImmCellTyper, a new toolkit for CyTOF data analysis. The semi-supervised clustering tool, BinaryClust, integrates prior biological knowledge and demonstrates competitive performance in various benchmarks, but there is room for strengthening the evidence base by addressing concerns about incomplete benchmarking results and the limited consideration of CyTOF markers with binary distribution. Overall, the manuscript offers solid potential for enhancing CyTOF data analysis methodologies.

Abstract

Mass cytometry, also known as cytometry by time-of-flight (CyTOF), is a cutting-edge high-dimensional technology for profiling marker expression at the single-cell level. This technology significantly advances clinical research in immune monitoring and the interrogation of immune cell populations. Nevertheless, the vast amount of data generated by CyTOF poses a daunting challenge for analysis. To address this, we describe ImmCellTyper (https://github.com/JingAnyaSun/ImmCellTyper), a novel and robust toolkit designed for CyTOF data analysis. The analytical framework incorporates an in-house semi-supervised clustering tool named BinaryClust, which first characterises the main cell lineages; populations of interest are then interrogated in depth using unsupervised methods. BinaryClust was benchmarked against existing clustering tools and demonstrated superior accuracy and speed across two datasets comprising around 4 million cells, performing as well as manual gating by human experts. Furthermore, this computational pipeline provides a variety of visualization and analytical tools spanning from quality control to differential analysis, which can be tailored to users’ specific needs, aiming to provide a one-stop solution for CyTOF data analysis. The general workflow consists of five key steps: 1) batch effect evaluation and correction, 2) data quality control and pre-processing, 3) main cell lineage characterisation and quantification, 4) extraction and in-depth investigation of cell types of interest, and 5) differential analysis of cell abundance and functional marker expression (supporting multiple study groups). Overall, ImmCellTyper integrates expert biological knowledge in a semi-supervised fashion to accurately deconvolute well-defined main cell lineages, while preserving the potential of unsupervised approaches to discover novel cell subsets and providing a user-friendly toolset that removes the analytical barrier to high-dimensional immune profiling.

Article activity feed

  1. eLife assessment

    This valuable manuscript presents ImmCellTyper, a new toolkit for CyTOF data analysis. The semi-supervised clustering tool, BinaryClust, integrates prior biological knowledge and demonstrates competitive performance in various benchmarks, but there is room for strengthening the evidence base by addressing concerns about incomplete benchmarking results and the limited consideration of CyTOF markers with binary distribution. Overall, the manuscript offers solid potential for enhancing CyTOF data analysis methodologies.

  2. Reviewer #1 (Public Review):

    Summary:

    This manuscript presents a useful toolkit for CyTOF data analysis that integrates five key steps into one analytical framework. A semi-supervised clustering tool was developed, and its performance was tested on multiple independent datasets. The tool was compared with human experts as well as with supervised and unsupervised methods.

    Strengths:

    The study employed multiple independent datasets to test the pipeline. A new semi-supervised clustering method was developed.

    Weaknesses:

    The examination of the whole pipeline is incomplete, and descriptions or justifications are lacking for some analyses.

  3. Reviewer #2 (Public Review):

    Summary:

    The authors have developed a binary clustering algorithm, based on marker selection and k-means (k=2), for the first-level supervised clustering of CyTOF datasets. They built a seamless pipeline that offers the multiple functionalities required for CyTOF data analysis.
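
    To make the idea in this summary concrete, the following is a minimal sketch of per-marker binary clustering with k-means (k=2) followed by matching against a marker table. It is an illustration only, not the authors' BinaryClust implementation, whose details may differ. The function names, the toy expression values, and the two-entry cell-type table are all hypothetical.

```python
from statistics import mean


def two_means_threshold(values, max_iter=100):
    """1-D k-means with k=2: return the midpoint between the two centroids.

    In one dimension, assigning each value to its nearest centroid is the
    same as thresholding at the midpoint of the two centroids.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # no spread: every cell will be called negative
    for _ in range(max_iter):
        cut = (lo + hi) / 2
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def binarize(cells, markers):
    """Call each marker positive (True) or negative (False) for every cell."""
    cuts = {m: two_means_threshold([c[m] for c in cells]) for m in markers}
    return [{m: c[m] > cuts[m] for m in markers} for c in cells]


def annotate(calls, cell_types):
    """Label a cell only if exactly one cell-type rule matches its calls."""
    labels = []
    for call in calls:
        hits = [name for name, rule in cell_types.items()
                if all(call[m] == want for m, want in rule.items())]
        labels.append(hits[0] if len(hits) == 1 else "unassigned")
    return labels


# Hypothetical toy data: transformed expression of two lineage markers.
cells = [
    {"CD3": 5.1, "CD19": 0.2},
    {"CD3": 4.8, "CD19": 0.1},
    {"CD3": 0.3, "CD19": 4.9},
    {"CD3": 0.2, "CD19": 5.2},
    {"CD3": 0.1, "CD19": 0.3},
]
# Hypothetical prior-knowledge table: True = positive, False = negative.
cell_types = {
    "T cell": {"CD3": True, "CD19": False},
    "B cell": {"CD3": False, "CD19": True},
}

labels = annotate(binarize(cells, ["CD3", "CD19"]), cell_types)
print(labels)
```

    In this sketch, cells that match no rule (or more than one) are left unassigned rather than forced into a lineage, so they remain available for a later unsupervised pass.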

    Strengths:

    The strength of the study is the potential use of the pipeline for the CyTOF community as a wrapper for multiple functions required for the analysis. The concept of the first line of binary clustering with known markers can be practically powerful.

    Weaknesses:

    The weakness of the study is that the proposed algorithms offer little conceptual novelty, and the benchmarking was done under limited conditions.

  4. Reviewer #3 (Public Review):

    Summary:

    ImmCellTyper is a new toolkit for Cytometry by time-of-flight data analysis. It includes BinaryClust, a semi-supervised clustering tool (which takes into account prior biological knowledge), designed for automated classification and annotation of specific cell types and subpopulations. ImmCellTyper also integrates a variety of tools to perform data quality analysis, batch effect correction, dimension reduction, unsupervised clustering, and differential analysis.

    Strengths:

    The proposed algorithm takes prior biological knowledge into account.
    The results on different benchmarks indicate competitive or better performance (in terms of accuracy and speed) depending on the method.

    Weaknesses:

    The proposed algorithm considers only CyTOF markers with a binary distribution.