compaRe, an ultra-fast and robust suite for multiparametric screening, identifies phenotypic drug responses in acute myeloid leukemia
Curation statements for this article:
Curated by eLife
Evaluation Summary:
This paper aims to address the current gap in the efficient analysis of large-scale multiparameter flow cytometry and other datasets. The authors offer a software toolkit with an efficient algorithm for comparing numerous samples at once. The study is well presented and is relevant to single cell analysis research.
(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.)
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (eLife)
- Cancer Biology (eLife)
Abstract
Multiparametric phenotypic screening of cells, for example assessing their responses to small molecules or knockdown/knockout of specific genes, is a powerful approach to understanding cellular systems and identifying potential new therapeutic strategies. However, automated tools for analyzing similarities and differences between a large number of tested conditions have not been readily available. Methods designed for clustering cells cannot identify differences between samples effectively. We introduce compaRe for ultra-fast and robust analysis of multiparametric high-throughput screening. Applying a mass-aware gridding algorithm using hypercubes, compaRe performs automatic and effective similarity comparison for hundreds to thousands of tests and provides information about the treatment effect. Particularly for screening data, compaRe is equipped with modules to remove various sources of bias.
Benchmarking tests show that compaRe can circumvent batch effects and perform a similarity analysis substantially faster than conventional analysis tools. Applying compaRe to high-throughput flow cytometry screening data, we were able to distinguish subtle phenotypic drug responses in a human sample and a genetically engineered mouse model with acute myeloid leukemia (AML). compaRe revealed groups of drugs with similar responses even though their mechanisms are distinct from each other. In another screening, compaRe effectively circumvented batch effects and grouped samples from AML and myelodysplastic syndrome (MDS) patients using clinical flow cytometry data.
Article activity feed
-
Reviewer #1 (Public Review):
The manuscript "Comprehensive and unbiased multiparameter high-throughput screening by compaRe finds effective and subtle drug responses in AML models" by Hajkarim et al. introduces a pipeline for pre-processing and analyzing data from multiplex flow cytometry and other technologies. Preprocessing steps include algorithms for correcting common sources of bias in such data. Another key feature is a robust approach to measuring cell similarity across samples. Among its strengths, the manuscript is well written, and the analysis pipeline is well motivated and illustrated with apt examples. The similarity measure is very interesting as well.
There are a few weaknesses as well. It is not completely clear to me how this pipeline agrees and disagrees with common practice in the field. References 1-3, cited to document ongoing analytic challenges, are all at least 5 years old. Comparisons to other approaches, including the use of Jensen-Shannon divergence for similarity, make a convincing case that the proposed method is both effective and computationally efficient, but it is not clear whether the comparators represent the true standard of practice or mere straw men. The methodologies are complex and can be difficult to follow, especially the similarity measure.
-
Reviewer #2 (Public Review):
In this manuscript, Hajkarim et al. developed compaRe, a user-friendly software suite (written in R) for analyzing high-throughput, multi-parameter screening data. The compaRe toolkit includes several modules, which can be invoked individually to perform specific tasks such as quality control, bias correction, pairwise comparisons, clustering and data visualization.
Strengths:
1. All of these modules are available in both a command-line version and a GUI version for data analysis, visualization and interpretation of results.
2. The authors showed the utility of their toolkit in analyzing multiparameter mass and flow cytometric data from AML and MDS patient samples. Through this analysis using compaRe, the authors showed that they can identify patient heterogeneity and drug response profiles.
3. Overall, this is a well-organized and well-written manuscript describing the development of the new compaRe toolkit. The method is clearly described, and the user manual/tutorial is easy to follow.
4. It seems like compaRe will be a useful toolkit for the research community, which is eager for a one-stop pipeline for analyzing high-throughput multiparameter screening data.
Weakness:
1. The current manuscript lacks a comparison with other existing tools and methods for analyzing mass and flow cytometric data.
-
Reviewer #3 (Public Review):
Hajkarim et al. implement an algorithm in their toolkit, compaRe, that compares samples based on their similarity, in contrast to the more commonly used meta-clustering approaches, such as PhenoGraph, or dimensionality reduction with Jensen-Shannon divergence analysis. Similarities among samples are calculated from the proportions of cells within each sample that fall into n-dimensional "hypercubes" (a "hypergridding" that is mass-aware rather than blind), which are stratified by expression levels for the n markers. The authors demonstrate that this method is much more time-efficient, obviates subsampling, and is robust to batch effects. The method is particularly appropriate for large-scale datasets, as it facilitates the comparison of numerous samples, which would be helpful in screening efforts. The manuscript is written and presented well.
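To make the idea of comparing samples by hypercube occupancy concrete, here is a minimal Python sketch. It is not compaRe's implementation: the function names `hypercube_proportions` and `similarity` are hypothetical, the fixed interval boundaries per marker stand in for the mass-aware gridding (whose details are not given here), and the histogram-intersection score is an assumed similarity measure chosen only for illustration.

```python
from collections import Counter

import numpy as np


def hypercube_proportions(cells, edges):
    """Return the fraction of cells falling into each occupied hypercube.

    cells: (n_cells, n_markers) array of expression values.
    edges: one increasing 1-D array of interval boundaries per marker
           (fixed boundaries are an assumption; compaRe's gridding is mass-aware).
    """
    cells = np.asarray(cells, dtype=float)
    # For each marker, np.digitize returns the index of the interval a value lies in.
    codes = np.column_stack(
        [np.digitize(cells[:, j], edges[j]) for j in range(cells.shape[1])]
    )
    # A hypercube is identified by the tuple of interval indices across all markers.
    counts = Counter(map(tuple, codes.tolist()))
    n_cells = len(cells)
    return {cube: count / n_cells for cube, count in counts.items()}


def similarity(p, q):
    """Histogram intersection of two proportion maps, between 0 and 1 (assumed measure)."""
    return sum(min(p[cube], q[cube]) for cube in p.keys() & q.keys())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    edges = [np.array([1.0, 2.0]), np.array([1.0, 2.0])]  # three intervals per marker
    control = rng.normal(1.5, 0.5, size=(5000, 2))
    treated = rng.normal(2.5, 0.5, size=(5000, 2))
    p = hypercube_proportions(control, edges)
    q = hypercube_proportions(treated, edges)
    print("control vs control:", similarity(p, p))
    print("control vs treated:", similarity(p, q))
```

In this sketch every cell is counted in a single pass and only occupied hypercubes are stored, which gives some intuition for why a gridding approach can avoid subsampling and scale to many samples.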
Major strengths:
1. The study demonstrates sufficiently strong support for the toolkit's ability to determine similarity across samples and its computing efficiency with Figure 2, an important advantage of this tool.
2. Compared to other approaches, the method is advantageous for identifying groups of samples that may be similar in a very large-scale dataset. compaRe does not require (or make use of) manual expert annotation of meta-clusters. The workflow is efficient and unbiased.
Major weakness:
While the toolkit may clearly be useful in evaluating similarities across many samples, it does not seem to have clearly demonstrated its utility in exploring specific phenotypes in-depth within a high-parameter dataset.