Quantifying distribution shifts in single-cell data with scXMatch
Abstract
A basic task that frequently arises when analyzing single-cell data is to assess whether there is a global distribution shift between the data profiles of cells from two different conditions. Widely used approaches to this task, such as visual inspection of two-dimensional representations or clustering-based workflows, lack a solid statistical underpinning, are notoriously unstable, and are prone to confirmation bias. To promote more rigorous analysis, we present the scverse-compatible Python tool scXMatch. scXMatch is based on a non-parametric graph-based test that quantifies distribution shifts in arbitrary data spaces for which a suitable distance measure is available. We evaluated scXMatch on single-cell gene expression, chromatin accessibility, and imaging-derived cell morphology data, showing that it can robustly detect distribution shifts across different types of single-cell data. scXMatch thus aims to set a new standard in the single-cell biology field, replacing easy-to-manipulate semi-manual workflows for quantifying distribution shifts with principled statistical testing.
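To make the idea of a non-parametric graph-based test concrete, the following is a minimal sketch of Rosenbaum's cross-match test, one well-known test of this kind. The abstract does not name the test underlying scXMatch, so treating the cross-match test as representative is an assumption made here for illustration. The function names (`min_weight_perfect_matching`, `crossmatch_pmf`, `crossmatch_test`) are hypothetical and are not the scXMatch API. The sketch pools both samples, pairs all points so that the total within-pair distance is minimal, counts the pairs that mix the two groups, and compares that count with its exact null distribution. Euclidean distance and a brute-force matching that only scales to a handful of points are further simplifying assumptions.

```python
from functools import lru_cache
from math import comb, dist, factorial


def min_weight_perfect_matching(points):
    """Exact minimum-distance perfect matching via bitmask DP (toy sizes only)."""
    n = len(points)
    if n % 2:
        raise ValueError("need an even number of points")

    @lru_cache(maxsize=None)
    def best(mask):
        # mask marks already matched points; returns (cost, pairs) for the rest.
        if mask == (1 << n) - 1:
            return 0.0, ()
        i = next(k for k in range(n) if not mask & (1 << k))  # lowest unmatched
        options = []
        for j in range(i + 1, n):
            if not mask & (1 << j):
                cost, pairs = best(mask | (1 << i) | (1 << j))
                options.append((cost + dist(points[i], points[j]), ((i, j),) + pairs))
        return min(options)

    return best(0)[1]


def crossmatch_pmf(a1, n_x, n_y):
    """Null probability of observing exactly a1 cross-group pairs."""
    total = n_x + n_y
    if a1 < 0 or a1 > min(n_x, n_y) or (n_x - a1) % 2:
        return 0.0
    a_xx, a_yy = (n_x - a1) // 2, (n_y - a1) // 2  # within-group pair counts
    num = 2 ** a1 * factorial(total // 2)
    den = comb(total, n_x) * factorial(a_xx) * factorial(a1) * factorial(a_yy)
    return num / den


def crossmatch_test(x, y):
    """Return (number of cross-group pairs, lower-tail p-value)."""
    points = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    pairs = min_weight_perfect_matching(points)
    # A pair is a cross-match when exactly one endpoint comes from x.
    a1 = sum((i < n_x) != (j < n_x) for i, j in pairs)
    # Few cross-matches indicate a shift, so sum the lower tail.
    p_value = sum(crossmatch_pmf(a, n_x, n_y) for a in range(a1 + 1))
    return a1, p_value


# Two clearly separated toy groups: no pair mixes the groups.
x_demo = [(0, 0), (0, 1), (1, 0), (1, 1)]
y_demo = [(10, 10), (10, 11), (11, 10), (11, 11)]
a1_demo, p_demo = crossmatch_test(x_demo, y_demo)
print(a1_demo, p_demo)  # 0 cross-matches, p = 3/35 (about 0.086)
```

With only four points per group, even perfect separation cannot yield a p-value below 3/35, which shows how the exact null distribution bounds the evidence available from small samples. For realistic cell numbers, the brute-force matching would need to be replaced by a blossom-type solver such as NetworkX's `max_weight_matching` applied to suitably transformed weights.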