Identifying the Minimal Number of Protein Markers for Cell Type Annotation Using MiniMarS

Abstract

Over the past decade, there has been an explosion in the characterisation and discovery of cell populations using single-cell technologies. Single-cell multi-omics data, particularly those incorporating gene and protein expression, are increasingly commonplace and can lead to more refined characterisation of cell types. A common challenge for biologists is to isolate cells of interest using a minimal number of markers for cytometry experiments. Although several methods exist for marker selection, there is limited guidance on their relative performance, and a wrapper package that combines multiple methods is lacking. The best-performing method can vary with the dataset, and it can be challenging for researchers to test multiple methods on a given dataset. To address these issues, we present MiniMarS (Minimal Marker Selection), an R package that serves as a wrapper for 10 different algorithms. It allows users to determine the best-performing algorithm for identifying the optimal number of markers that will delineate cell populations in their dataset. As input, MiniMarS takes pre-annotated cells with protein features from CyTOF or from sequencing-based assays such as CITE-seq and Abseq. Outputs include 1) the minimum number of protein markers required to identify the annotated cell populations using a range of marker selection algorithms, and 2) a range of metrics to evaluate the performance of each algorithm. MiniMarS effectively differentiated populations across various datasets, including those from human blood, bone marrow, thymus, mouse spleen, and lymph nodes, even after subsampling from over 41,000 cells to 2,500 cells. MiniMarS also identified 15 markers from CITE-seq data, which were then used to successfully identify the same 11 cell subsets in a CyTOF dataset (F1 score > 0.9). Additionally, we showed that by appropriately combining clusters, MiniMarS improves the F1 score for identification of a rare population (<1% of total cells) by 28.7%. Together, these findings highlight the broad applicability of MiniMarS in identifying appropriate markers for distinguishing cell populations.
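
The F1 scores quoted in the abstract are per-population metrics: each annotated cell population gets its own score by comparing the annotated labels with the labels predicted from the selected markers. As a minimal sketch of that calculation, assuming one annotated label and one predicted label per cell, the Python snippet below computes a per-population F1 score. It is not MiniMarS code (MiniMarS is an R package whose API is not shown here); the function name `per_population_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_population_f1(true_labels, predicted_labels):
    """Return {population: F1} from paired annotated and predicted labels.

    Hypothetical helper for illustration; not part of MiniMarS.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_pos = Counter()   # cells correctly assigned to a population
    false_pos = Counter()  # cells wrongly assigned to a population
    false_neg = Counter()  # cells of a population that were missed

    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == predicted:
            true_pos[truth] += 1
        else:
            false_pos[predicted] += 1
            false_neg[truth] += 1

    scores = {}
    for population in set(true_labels) | set(predicted_labels):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denominator = (2 * true_pos[population]
                       + false_pos[population]
                       + false_neg[population])
        scores[population] = (2 * true_pos[population] / denominator
                              if denominator else 0.0)
    return scores


# Toy example (invented labels): one B cell is misclassified as a T cell.
annotated = ["B", "B", "T", "T", "T", "NK"]
predicted = ["B", "T", "T", "T", "T", "NK"]
for population, score in sorted(per_population_f1(annotated, predicted).items()):
    print(f"{population}: {score:.3f}")
```

Because the score is computed per population, a rare population (for example, one making up <1% of cells) can score poorly even when overall accuracy is high, which is why a per-population metric is informative when evaluating marker panels.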