Adaptive resampling for improved machine learning in imbalanced single-cell datasets

Zeinab Navidi
Akshaya Thoutam
Madeline Hughes
Srivatsan Raghavan
Peter S. Winter
Lorin Crawford
Ava P. Amini


Abstract

While machine learning models trained on single-cell transcriptomics data have shown great promise in providing biological insights, existing tools struggle to effectively model underrepresented and out-of-distribution cellular features or states. We present a generalizable Adaptive Resampling (AR) approach that addresses these limitations and enhances single-cell representation learning by resampling data based on its learned latent structure in an online, adaptive manner concurrent with model training. Experiments on gene expression reconstruction, cell type classification, and perturbation response prediction tasks demonstrate that the proposed AR training approach leads to significantly improved downstream performance across datasets and metrics. Additionally, it enhances the quality of learned cellular embeddings compared to standard training methods. Our results suggest that AR may serve as a valuable technique for improving representation learning and predictive performance in single-cell transcriptomic models.
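
The abstract does not spell out how resampling weights are derived from the latent structure, so the following is only a minimal sketch of one plausible scheme, not the authors' implementation. It assumes that cells in sparsely populated regions of the latent space should be drawn more often, and it estimates density with per-dimension histograms. The function name `adaptive_sampling_probs` and the parameters `n_bins` and `alpha` are hypothetical, and the synthetic `z` stands in for embeddings produced by a model's encoder.

```python
import numpy as np

def adaptive_sampling_probs(latents, n_bins=10, alpha=0.01):
    """Per-cell sampling probabilities that upweight sparse latent regions.

    latents: (n_cells, n_latent) array of current encoder embeddings.
    n_bins, alpha: hypothetical knobs (histogram resolution, smoothing term).
    """
    latents = np.asarray(latents, dtype=float)
    n_cells, n_latent = latents.shape
    probs = np.zeros(n_cells)
    for d in range(n_latent):
        # Estimate the 1-D density of this latent dimension with a histogram.
        density, edges = np.histogram(latents[:, d], bins=n_bins, density=True)
        # Assign each cell to a bin; interior edges keep indices in 0..n_bins-1.
        bin_idx = np.digitize(latents[:, d], edges[1:-1])
        # Inverse-density weight; alpha keeps the weights bounded.
        w = 1.0 / (density[bin_idx] + alpha)
        # Keep, per cell, the largest normalized weight seen over dimensions.
        probs = np.maximum(probs, w / w.sum())
    return probs / probs.sum()

# Synthetic stand-in for embeddings: 950 common cells and 50 rare cells.
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 1.0, (950, 2)), rng.normal(6.0, 0.3, (50, 2))])

p = adaptive_sampling_probs(z)
# Draw one training mini-batch according to the adaptive probabilities.
batch = rng.choice(len(z), size=128, p=p)
```

To make the resampling online in the sense the abstract describes, the probabilities would be recomputed from the current encoder's embeddings at regular intervals (for example, once per epoch) while the model trains, so the sampling distribution follows the representation as it changes.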

Version published to 10.1101/2025.11.04.686583 on bioRxiv
Nov 5, 2025

Related articles

Accurate, scalable, and unified single-cell atlas integration with scBIOT

This article has 2 authors:
1. Haihui Zhang
2. Peiwu Qin
This article has no evaluations. Latest version Jan 19, 2026

Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

This article has 2 authors:
1. Xiuwei Zhang
2. Yuqi Cheng
This article has no evaluations. Latest version Dec 10, 2025

Self-supervised Graph Contrastive Learning for scRNA-seq Clustering

This article has 1 author:
1. Tong Wu
This article has no evaluations. Latest version Dec 11, 2025
