Protein-based cell population discovery and annotation for CITE-seq data identifies cellular phenotypes associated with critical COVID-19 severity

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Technologies such as Cellular Indexing of Transcriptomes and Epitopes sequencing (CITE-seq) and RNA Expression and Protein sequencing (REAP-seq) augment unimodal single-cell RNA sequencing (scRNA-seq) by simultaneously measuring expression of cell-surface proteins using antibody derived oligonucleotide tags (ADT). These protocols have been increasingly used to resolve cellular populations that are difficult to infer from gene expression alone, and to interrogate the relationship between gene and protein expression at a single-cell level. However, the ADT-based protein expression component of these assays remains widely underutilized as a primary tool to discover and annotate cell populations, in contrast to flow cytometry which has used surface protein expression in this fashion for decades. Therefore, we hypothesized that computational tools used for flow cytometry data analysis could be harnessed and scaled to analyze ADT data. Here we apply Ozette Discovery™, a recently-developed method for flow cytometry analysis, to re-analyze a large (>400,000 cells) published COVID-19 CITE-seq dataset. Using the protein expression data alone, Ozette Discovery is able to identify granular, robust, and interpretable cellular phenotypes in a high-throughput manner. In particular, we identify a population of CLEC12A+CD11b+CD14- myeloid cells that are specifically expanded in patients with critical COVID-19, and can only be resolved by their protein expression profiles. Using the longitudinal gene expression data from this dataset, we find that early expression of interferon response genes precedes the expansion of this subset, and that early expression of PRF1 and GZMB within specific Ozette Discovery phenotypes provides a RNA biomarker of critical COVID-19. In summary, Ozette Discovery demonstrates that taking a protein-centric approach to cell phenotype annotation in CITE-seq data can achieve the potential that dual RNA/protein assays provide in mixed samples: instantaneous in silico flow sorting, and unbiased RNA-seq profiling.

HIGHLIGHTS

  • Ozette Discovery provides an alternative method for data-driven annotation of granular and homogeneous cell phenotypes in CITE-seq data using protein expression data alone.

  • Our approach inherently accommodates for batch effects, and our novel background-normalization method improves the signal:noise ratio of these notoriously noisy protein measurements.

  • While these subpopulations are not derived from RNA profiles, they have distinct and interpretable RNA signatures.

  • We find a population of CLEC12A+CD11b+CD14- myeloid cells associated with critical COVID-19 severity that can only be identified by their protein profiles, and identify early expression of interferon response genes in a CD4 T cell subset as a predictor of CLEC12A+CD11b+CD14- cell expansion.

  • Peforming differential expression analysis within our identified phenotypes reveals predictors of COVID-19 severity that are not found with coarser annotations.

Article activity feed