A Graph-Attentive GAN for Rare-Cell-Aware Single-Cell RNA-Seq Data Generation

Abstract

A central challenge in downstream single-cell RNA sequencing (scRNA-seq) analysis is the high-dimensional, small-sample (HDSS) regime, often compounded by class imbalance from rare cell types. These factors hinder robust feature (gene) selection and cell clustering, and they limit the realism of samples generated by existing simulators. We introduce GARAGE, a Graph-Attentive RAre-cell-aware single-cell data GEneration method that augments the generator’s input with a small, attention-weighted ‘leakage’ of real cells in addition to prior noise. Specifically, we build a k-nearest-neighbour cell graph and use a graph attention network (GAT) to prioritize nodes that likely represent under-sampled (rare) subpopulations; these high-attention cell embeddings are injected into the generator input to steer synthesis toward biologically plausible regions of the data manifold while respecting cell-type proportions. This attention-guided leakage accelerates training, reduces mode dropping, and yields realistic synthetic cells that preserve rare-cell structure. Across real scRNA-seq benchmarks, GARAGE improves downstream feature selection and clustering compared with state-of-the-art baselines. In summary, GARAGE addresses HDSS and rarity in scRNA-seq by coupling graph attention with adversarial generation to produce high-fidelity synthetic cells that enhance downstream analyses.
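The abstract does not give implementation details, so the following is only an illustrative sketch of the data flow it describes: a k-nearest-neighbour cell graph, GAT-style attention over graph edges, and a generator input formed from prior noise plus a few high-attention cell embeddings. It is not the authors' code. All names (`knn_graph`, `gat_attention`, `generator_input`), the single untrained attention head with random weights, the rule of scoring a cell by the total attention it receives, and the concatenation of noise with embeddings are assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, k):
    """Return an (n, k) array of neighbour indices (Euclidean distance)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)            # a cell is not its own neighbour
    return np.argsort(d2, axis=1)[:, :k]

def gat_attention(X, nbrs, W, a, slope=0.2):
    """Single-head GAT-style attention; returns embeddings H and alpha (n, k)."""
    H = X @ W                               # linear projection of each cell
    d = H.shape[1]
    # a^T [h_i || h_j] splits into a source term plus a neighbour term
    e = (H @ a[:d])[:, None] + (H @ a[d:])[nbrs]
    e = np.where(e > 0, e, slope * e)       # LeakyReLU
    e = e - e.max(axis=1, keepdims=True)    # numerically stable softmax
    alpha = np.exp(e)
    alpha /= alpha.sum(axis=1, keepdims=True)
    return H, alpha

def generator_input(X, k=5, d=8, n_leak=4, noise_dim=16, seed=0):
    """Build [noise || leaked embedding] rows for the top-attention cells."""
    rng = np.random.default_rng(seed)
    n, g = X.shape
    # Assumption: random, untrained weights stand in for a trained GAT.
    W = rng.normal(scale=1.0 / np.sqrt(g), size=(g, d))
    a = rng.normal(size=2 * d)
    nbrs = knn_graph(X, k)
    H, alpha = gat_attention(X, nbrs, W, a)
    # Assumption: a cell's priority is the total attention it receives.
    score = np.zeros(n)
    np.add.at(score, nbrs.ravel(), alpha.ravel())
    top = np.argsort(score)[::-1][:n_leak]
    z = rng.normal(size=(n_leak, noise_dim))  # prior noise
    return np.concatenate([z, H[top]], axis=1), top, alpha

# Toy log-normalised count matrix: 50 cells x 30 genes (synthetic, not real data).
X = np.log1p(np.random.default_rng(1).poisson(2.0, size=(50, 30)))
G_in, top, alpha = generator_input(X)
```

In this sketch each row of `G_in` pairs one noise vector with the embedding of one selected real cell. In a trained model the attention weights would be learned, and how the selected embeddings are combined with the noise could differ from the simple concatenation used here.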
