Benchmarking single cell transcriptome matching methods for incremental growth of reference atlases

Joyce Hu
Beverly Peng
Ajith V. Pankajam
Bingfang Xu
Vikrant Anil Deshpande
Andreas Bueckle
Bruce W. Herr
Katy Börner
Christopher Dupont
Richard H. Scheuermann
Yun Zhang

Read the full article

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.

Abstract

Background

The advancement of single cell technologies has driven significant progress in constructing a multiscale, pan-organ Human Reference Atlas (HRA) for healthy human cells, though challenges remain in harmonizing cell types and unifying nomenclature. Multiple machine learning and artificial intelligence methods, including pre-trained and fine-tuned models on large-scale atlas data, are publicly available for the single cell community users to computationally annotate and match their cell clusters to the reference atlas.

Results

This study benchmarks four computational tools for cell type annotation and matching – Azimuth, CellTypist, scArches, and FR-Match – using two lung atlas datasets, the Human Lung Cell Atlas (HLCA) and the LungMAP single-cell reference (CellRef). Despite achieving high overall performance while comparing algorithmic cell type annotations to expert annotated data, variations in accuracy were observed, especially in annotating rare cell types, underlining the need for improved consistency across cell type prediction methods. The benchmarked methods were used to cross-compare and incrementally integrate 61 cell types from HLCA and 48 cell types from CellRef, resulting in a meta-atlas of 41 matched cell types, 20 HLCA-specific cell types, and 7 CellRef-specific cell types.

Conclusion

This study reveals complementing strengths of the benchmarked methods and presents a framework for incremental growth of the cell type inventory in the reference atlases, leading to 68 unique cell types in the meta-atlas across CellRef and HLCA. The benchmarking analysis contributes to improving the coverage and quality of HRA construction by assessing the reliability and performance of cell type annotation approaches for single cell transcriptomics datasets.

Version published to 10.1101/2025.04.10.648034 on bioRxiv
Apr 16, 2025

Accurate, scalable, and unified single-cell atlas integration with scBIOT

This article has 2 authors:
1. Haihui Zhang
2. Peiwu Qin
This article has no evaluationsLatest version Jan 19, 2026
Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

This article has 2 authors:
1. Xiuwei Zhang
2. Yuqi Cheng
This article has no evaluationsLatest version Dec 10, 2025
Comprehensive benchmarking of RNA velocity methods across single-cell datasets

This article has 6 authors:
1. Jin Liu
2. Yida Wu
3. Chuihan Kong
4. Xu Liao
5. Zhixiang Lin
6. Xiaobo Sun
This article has no evaluationsLatest version Feb 2, 2026

Discuss this preprint

Listed in

Abstract

Background

Results

Conclusion

Article activity feed

Related articles

Accurate, scalable, and unified single-cell atlas integration with scBIOT

Discovering cell types and states from reference atlases with heterogeneous single-cell ATAC-seq features

Comprehensive benchmarking of RNA velocity methods across single-cell datasets