CellPhenoX: An eXplainable Cell-specific machine learning method to predict clinical Phenotypes using single-cell multi-omics
Abstract
Single-cell technologies have enhanced our knowledge of the molecular and cellular heterogeneity underlying disease. As the scale of single-cell datasets expands, linking cell-level phenotypic alterations with clinical outcomes becomes increasingly challenging. To address this, we introduce CellPhenoX, an eXplainable machine learning method to identify cell-specific phenotypes that influence clinical outcomes. CellPhenoX integrates classification models, explainable AI techniques, and a statistical framework to generate interpretable, cell-specific scores that uncover cell populations associated with relevant clinical phenotypes and interaction effects. We demonstrated the performance of CellPhenoX across diverse single-cell designs, including simulations, binary disease-control comparisons, and multi-class studies. Notably, CellPhenoX identified an activated monocyte phenotype in COVID-19 whose expansion correlated with disease severity after adjusting for covariates and interaction effects. It also uncovered an inflammation-associated gradient in fibroblasts from ulcerative colitis. We anticipate that CellPhenoX can detect clinically relevant phenotypic changes in single-cell data with multiple sources of variation, paving the way for translating single-cell findings into clinical impact.
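To make the idea of a per-cell, interpretable score concrete, the following is a minimal toy sketch and not the CellPhenoX implementation. The abstract does not specify the classifier or the explainability technique, so everything here is an assumption chosen for illustration: the simulated two-dimensional "latent" features, the plain logistic regression, and the linear attribution `w_j * (x_j - mean_j)` (which, for a linear model with independent features, coincides with the SHAP value in log-odds space). The function names `simulate_cells`, `fit_logistic`, and `cell_scores` are hypothetical.

```python
import math
import random


def simulate_cells(n_per_group=200, seed=0):
    """Toy data (assumption): two latent dimensions per cell.
    Cells from 'disease' samples (label 1) are shifted along dimension 0."""
    rng = random.Random(seed)
    cells, labels = [], []
    for label in (0, 1):
        shift = 2.0 if label == 1 else 0.0
        for _ in range(n_per_group):
            cells.append([rng.gauss(shift, 1.0), rng.gauss(0.0, 1.0)])
            labels.append(label)
    return cells, labels


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Fit a logistic regression classifier by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def cell_scores(X, w):
    """Per-cell score: summed linear attributions w_j * (x_j - mean_j).
    Positive scores push a cell toward the 'disease' class."""
    d = len(w)
    means = [sum(x[j] for x in X) / len(X) for j in range(d)]
    return [sum(w[j] * (x[j] - means[j]) for j in range(d)) for x in X]


X, y = simulate_cells()
w, b = fit_logistic(X, y)
scores = cell_scores(X, w)

# Compare the average score of cells from each group.
disease_mean = sum(s for s, lab in zip(scores, y) if lab == 1) / y.count(1)
control_mean = sum(s for s, lab in zip(scores, y) if lab == 0) / y.count(0)
print("mean score (disease cells):", disease_mean)
print("mean score (control cells):", control_mean)
```

In this toy setting, cells that sit in the disease-shifted region of the latent space receive higher scores, which is the kind of cell-level signal a downstream statistical test could then relate to clinical covariates. A real analysis would use a richer classifier, covariate adjustment, and a dedicated explainability library rather than this hand-rolled attribution.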