A novel deep optimal transport framework reveals prostate cancer risk heterogeneity, Alzheimer’s disease risk heterogeneity, and myeloma cells associated with a short-term sham bortezomib response and progressive disease
Abstract
Leveraging single-cell gene expression profiles can significantly enhance our understanding of diseases by associating single cells with traits such as disease subtypes, prognosis, and drug response. Although previous efforts have linked single-cell clusters and groups with these attributes, they have primarily focused on changes in cell proportions while overlooking transcriptional changes at the single-cell level. To further unravel cell heterogeneity within clusters and reveal the nuanced behaviors of cellular subtypes, it is essential to assess the disease associations of individual cells. Previous methods often fail to capture complex patterns that are discernible only by summarizing non-linear relationships across multiple genes. The Diagnostic Evidence GAuge of Single-cells/Spatial-transcriptomics (DEGAS) framework advances these efforts by aligning single cells and/or spatial transcriptomics regions with patients through a unified latent space, using non-linear transformations learned by deep neural networks (DNNs). DEGAS achieves superior performance in analyzing single-cell and spatial transcriptomics datasets, including Alzheimer’s disease (AD), multiple myeloma (MM), and prostate cancer (PCa). Here we present DEGAS version 2 (DEGASv2), which adds optimal-transport-based transfer learning, improved time-to-event loss functions, a more advanced model architecture, and improved baseline evaluations. DEGASv2 outperformed other methods in both single-cell and spatial transcriptomics baseline comparisons. On the multiple myeloma discovery dataset, DEGASv2 enabled us to identify cell types with distinct drug response patterns over various time frames. These findings were validated with time-series single-cell multiomic data that we generated, revealing a high-risk cell subtype and a novel therapeutic target.
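To make the idea of optimal-transport-based alignment between cells and patients concrete, the following minimal sketch computes an entropy-regularized transport plan with Sinkhorn iterations and uses it to carry a patient-level score over to individual cells. It is an illustration under stated assumptions, not the DEGASv2 implementation: the embeddings are random stand-ins for a learned latent space, and the names `sinkhorn_plan`, `patient_risk`, and `cell_risk`, as well as the choice of Sinkhorn itself, are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, a, b, epsilon=0.2, n_iters=1000):
    """Entropy-regularized optimal transport plan between histograms a and b.

    Hypothetical helper for illustration; not taken from DEGAS.
    """
    K = np.exp(-cost / epsilon)        # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                # rescale rows toward marginal a
        v = b / (K.T @ u)              # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]


rng = np.random.default_rng(0)

# Assumption: random vectors stand in for latent embeddings of cells and patients.
cells = rng.normal(size=(200, 16))
patients = rng.normal(loc=0.5, size=(50, 16))

# Squared Euclidean cost, scaled to [0, 1] for numerical stability.
diff = cells[:, None, :] - patients[None, :, :]
cost = (diff ** 2).sum(axis=-1)
cost = cost / cost.max()

# Uniform mass on every cell and every patient.
a = np.full(cells.shape[0], 1.0 / cells.shape[0])
b = np.full(patients.shape[0], 1.0 / patients.shape[0])

plan = sinkhorn_plan(cost, a, b)
ot_cost = float((plan * cost).sum())

# Assumption: a made-up patient-level score in [0, 1], transferred to cells
# as a plan-weighted average over the patients each cell is coupled with.
patient_risk = rng.uniform(size=patients.shape[0])
cell_risk = (plan / plan.sum(axis=1, keepdims=True)) @ patient_risk
```

Each row of `plan` describes how one cell's mass is distributed across patients, so `cell_risk` is a convex combination of patient scores. A smaller `epsilon` gives a sharper coupling but needs more iterations and, in practice, log-domain updates to stay numerically stable.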