JIND-Multi: Leveraging Multiple Labeled Datasets for Automated Annotation of Single-Cell RNA and ATAC Data
Abstract
Background
The creation of single-cell atlases is essential for understanding cellular diversity and heterogeneity. However, assembling these atlases is challenging due to batch effects and the need for accurate cell annotation. Current methods for single-cell RNA and ATAC sequencing data, while effective for integration, are not optimized for cell annotation. Additionally, many annotation tools rely on external databases or reference scRNA-Seq datasets, which may limit their adaptability to specific study needs, especially for rare cell types or scATAC-Seq data.
Results
We introduce JIND-Multi, an extended version of the JIND framework, designed to transfer cell-type labels across multiple annotated datasets. JIND-Multi significantly reduces the proportion of unclassified cells in single-cell RNA sequencing (scRNA-Seq) data while maintaining the accuracy and performance of the original JIND model. Furthermore, JIND-Multi yields robust and precise annotations in its first application to scATAC-Seq data, showing its versatility and effectiveness across different single-cell sequencing technologies.
Conclusions
JIND-Multi represents an improvement in cell annotation, reducing unassigned cells and offering a reliable solution for both scRNA-Seq and scATAC-Seq data. Its ability to handle multiple labeled datasets enhances the precision of annotations, making it a valuable tool for the single-cell research community. JIND-Multi is publicly available at https://github.com/ML4BM-Lab/JIND-Multi.git.