scContrast: A contrastive learning based approach for encoding single-cell gene expression data

Abstract

Single-cell RNA sequencing (scRNA-seq) captures gene expression at individual-cell resolution, revealing critical insights into cellular diversity, disease processes, and developmental biology. However, a key challenge in scRNA-seq analysis is clustering similar cells across multiple batches, particularly when the batches were generated with different sequencing protocols. In this work, we present scContrast, a semi-supervised contrastive learning method tailored for embedding scRNA-seq data from both plate- and droplet-based protocols into a universal representation space. By leveraging five simple augmentations, scContrast extracts biologically relevant signals from gene expression data while filtering out batch effects and technical artifacts. We trained scContrast on a subset of Tabula Muris tissues and evaluated its zero-shot performance on unseen tissues. Our results demonstrate that scContrast generalizes effectively to new tissues and outperforms the leading UCE approach in integrating scRNA-seq data from droplet- and plate-based sequencing protocols.
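
The general idea of contrastive learning over augmented views can be illustrated with a minimal sketch. The abstract does not specify the five augmentations, the encoder, or the loss that scContrast uses, so everything below is an assumption made for illustration: the `augment` function (random gene dropout plus Gaussian noise) is hypothetical, the loss is the standard NT-Xent loss from SimCLR rather than the semi-supervised objective of the paper, and the "encoder" is simply the identity.

```python
import numpy as np


def augment(x, rng, drop_prob=0.2, noise_std=0.1):
    """Hypothetical augmentation (not from the paper): randomly zero out
    genes, then add Gaussian noise to every entry."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep + rng.normal(0.0, noise_std, size=x.shape)


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent (SimCLR-style) loss. Row i of z1 and row i of z2 are treated
    as two views of the same cell; all other rows act as negatives."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-length rows
    sim = z @ z.T / temperature                       # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                    # a view is never its own negative
    # Index of the positive partner for each of the 2n rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())


# Toy data: 8 cells x 50 genes of simulated counts, log-transformed.
rng = np.random.default_rng(0)
counts = rng.poisson(1.0, size=(8, 50)).astype(float)
log_counts = np.log1p(counts)

# Two independently augmented views of the same cells.
view1 = augment(log_counts, rng)
view2 = augment(log_counts, rng)

# Identity "encoder": the views themselves stand in for embeddings.
loss = nt_xent_loss(view1, view2)
```

In a real pipeline the two views would pass through a trainable encoder, and minimizing the loss would pull views of the same cell together while pushing different cells apart. That mechanism is what lets a contrastive model learn to ignore the kind of variation the augmentations simulate.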
