Functional Subpopulations of Hematopoietic Stem Cells and Multipotent Progenitors Classification Using Transfer Learning
Abstract
Background: The functional classification of hematopoietic stem cells (HSCs) and multipotent progenitors (MPPs) is central to understanding hematopoiesis and developing regenerative therapies. Fluorescence-activated cell sorting (FACS) has been the gold standard for distinguishing these subpopulations, but it remains labor-intensive, technically demanding, and limited in scalability. Automated, image-based approaches offer a promising alternative, yet their application to hematopoietic stem and progenitor cells has been constrained by the lack of large, annotated datasets and standardized analytic frameworks.
Methods: We present the largest publicly available microscopy dataset of hematopoietic stem and progenitor cells to date, encompassing three biologically distinct subpopulations: long-term HSCs (LT-HSCs), short-term HSCs (ST-HSCs), and MPPs. To analyze this resource, we developed a deep learning framework based on transfer learning with DenseNet architectures. A novel preprocessing strategy transformed multi-slice grayscale microscopy data into RGB composites, making the data compatible with pre-trained convolutional neural networks (CNNs). Two complementary pipelines were designed: (i) an image-level pipeline that classifies entire microscopy fields and (ii) a cell-level pipeline that uses Laplacian of Gaussian (LoG)-based blob detection for single-cell segmentation. Each model was trained and validated using stratified splits, with extensive data augmentation to enhance generalization.
Results: Among all architectures evaluated, DenseNet169 achieved the highest performance, attaining an area under the receiver operating characteristic curve (AUROC) of 99.5% and a balanced accuracy of 89.3% in the cell-level classification task. The model effectively distinguished LT-HSC, ST-HSC, and MPP populations, substantially outperforming previously reported single-channel or non-segmented approaches.
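For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, so each of the three populations counts equally regardless of how many cells it contributes. The following is a minimal sketch of that definition in plain Python. The function name `balanced_accuracy` and the toy labels are illustrative assumptions, not the authors' evaluation code or data.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Return the mean per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    # Number of true samples per class.
    support = Counter(y_true)
    # Number of correctly predicted samples per class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [correct[cls] / n for cls, n in support.items()]
    return sum(recalls) / len(recalls)


# Hypothetical, deliberately imbalanced labels (not data from the study).
y_true = ["LT-HSC"] * 2 + ["ST-HSC"] * 2 + ["MPP"] * 6
y_pred = ["LT-HSC", "ST-HSC"] + ["ST-HSC"] * 2 + ["MPP"] * 6

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={plain_accuracy:.3f}, "
      f"balanced={balanced_accuracy(y_true, y_pred):.3f}")
```

In this toy case one misclassified LT-HSC cell lowers plain accuracy only slightly but lowers balanced accuracy more, because the rare class carries the same weight as the abundant one.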
Grad-CAM visualization confirmed that the model's discriminative focus aligned with biologically relevant cellular regions, supporting interpretability and reproducibility. Comparative analyses demonstrated that integrating multi-channel image representation and optimized segmentation markedly enhanced accuracy and robustness.
Conclusion: This work introduces a reproducible and open-source deep learning framework for hematopoietic stem and progenitor cell classification. By integrating multi-channel imaging, transfer learning, and explainable AI, the proposed approach establishes a scalable, label-free alternative to conventional FACS, paving the way for high-throughput and automated phenotyping in hematopoietic and regenerative medicine research.
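The abstract states that models were trained and validated on stratified splits but does not say how those splits were produced. As a rough illustration of the idea only, the sketch below divides sample indices so that each class keeps approximately the same proportion in the training and validation sets. The function `stratified_split`, the 20% validation fraction, the seed, and the class counts are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split indices into train/validation sets, preserving class proportions."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out at least one sample per class; a class with a single
        # sample therefore ends up in validation only.
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical class counts (not the dataset's real composition).
labels = ["LT-HSC"] * 10 + ["ST-HSC"] * 20 + ["MPP"] * 70
train_idx, val_idx = stratified_split(labels, val_fraction=0.2, seed=42)
print(len(train_idx), len(val_idx))
```

Splitting within each class, rather than over the pooled samples, keeps a rare population such as LT-HSCs represented in the validation set, which matters when the reported metric is balanced accuracy.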