InferPloidy: Fast and accurate ploidy inference enables inter-tumoral biomarker discovery in single-cell RNA-seq datasets

Abstract

Background:

Accurate inference of copy number variation (CNV) and ploidy from single-cell RNA-seq data is essential for resolving tumor heterogeneity and identifying malignant cells, yet existing tools such as CopyKat and SCEVAN are limited by long runtimes and reduced accuracy in large or heterogeneous datasets.

Results:

Here, we present InferPloidy, a fast and robust ploidy inference method built on InferCNV that combines graph-based cell grouping with iterative Gaussian mixture modeling. Across multiple cancer types (breast cancer, non-small cell lung cancer, pancreatic ductal adenocarcinoma, and colorectal cancer), InferPloidy ran up to two orders of magnitude faster than existing tools while maintaining superior classification accuracy. This accurate separation of aneuploid tumor cells enabled the discovery of subtype-specific therapeutic targets, including ERBB2, ESR1, EGFR, and MET, as well as recurrent surfaceome markers such as CD82, F11R, SLC2A1, TM9SF2, CXADR, and PLPP2, several of which have preclinical or clinical relevance.
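
As a rough illustration of the general idea of pairing cell grouping with Gaussian mixture modeling, the sketch below fits a two-component, one-dimensional Gaussian mixture to a per-cell CNV score and then calls each cell group by its mean posterior. It is not InferPloidy's implementation: the score, the synthetic data, the fixed group labels (standing in for graph-based grouping), the single non-iterated fit, and the names `fit_two_component_gmm` and `call_groups` are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(scores, n_iter=50):
    """Fit a 1-D two-component GMM with EM (hypothetical helper).

    Returns, for each score, the posterior probability of belonging to
    the component with the higher mean (taken here as "aneuploid").
    """
    n = len(scores)
    grand_mean = sum(scores) / n
    spread = math.sqrt(sum((x - grand_mean) ** 2 for x in scores) / n) or 1e-6
    # Initialise the two means at the extremes of the data.
    means, stds, weights = [min(scores), max(scores)], [spread, spread], [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, spread, and weight of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            means[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, scores)) / nk
            stds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / n
    high = 0 if means[0] > means[1] else 1
    return [r[high] for r in resp]


def call_groups(posteriors, groups, threshold=0.5):
    """Call a group aneuploid when its mean posterior exceeds the threshold."""
    sums, counts = {}, {}
    for p, g in zip(posteriors, groups):
        sums[g] = sums.get(g, 0.0) + p
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] > threshold for g in sums}


# Synthetic stand-in data (assumed): low scores for diploid cells,
# higher and more variable scores for aneuploid cells.
random.seed(0)
scores = [random.gauss(0.02, 0.005) for _ in range(200)]
scores += [random.gauss(0.10, 0.02) for _ in range(200)]
truth = [False] * 200 + [True] * 200
groups = [i // 50 for i in range(400)]  # 8 fixed groups of 50 cells

posteriors = fit_two_component_gmm(scores)
cell_calls = [p > 0.5 for p in posteriors]
group_calls = call_groups(posteriors, groups)
accuracy = sum(c == t for c, t in zip(cell_calls, truth)) / len(truth)
print(f"cell-level accuracy: {accuracy:.3f}")
print("group calls:", group_calls)
```

Calling ploidy per group rather than per cell is one plausible way to damp the noise in individual scRNA-seq profiles; how InferPloidy actually builds its groups and iterates the mixture fit is described in the full article.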

Conclusion:

These results establish InferPloidy as a scalable platform for CNV-guided tumor cell identification and surfaceome-based biomarker discovery, offering broad utility for precision oncology and translational research.
