AnnotateAnyCell: Open-Source AI Framework for Efficient Annotation in Digital Pathology
Abstract
Importance
Manual annotation of histopathological images by pathologists is effort-intensive and represents a major challenge for computational analysis and clinical AI deployment.
Objective
To develop and validate an open-source semi-supervised deep learning framework for accurate and efficient annotation and analysis of cellular structures in whole slide images.
Design
Methods development study using active contrastive learning with iterative human-in-the-loop feedback. The framework integrates cell segmentation, embedding-based visualization, and semi-supervised classification.
Setting
Digital pathology computational analysis platform with an intuitive web-based annotation interface.
Methods
Five whole slide images of canine invasive urothelial carcinoma digitized at 40× magnification, representing low, intermediate, and high histological grades with diverse morphological patterns. The AnnotateAnyCell framework processes cells through Cellpose segmentation, UMAP dimensionality reduction for interactive annotation, and contrastive learning-based classification with pseudolabel generation.
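The pseudolabel generation step can be illustrated with a minimal sketch. It assumes a simple confidence-threshold rule, which is a common approach but is not confirmed by the abstract as the authors' method; the function name `generate_pseudolabels`, the threshold of 0.9, and the example probabilities are all hypothetical.

```python
def generate_pseudolabels(probs, threshold=0.9):
    """Select unlabeled cells whose top predicted class probability
    meets the threshold (an assumed rule, not the paper's).

    probs: list of per-cell class-probability lists.
    Returns a list of (cell_index, predicted_class) pairs.
    """
    selected = []
    for index, row in enumerate(probs):
        confidence = max(row)
        if confidence >= threshold:
            # row.index returns the first class with the top probability
            selected.append((index, row.index(confidence)))
    return selected


# Hypothetical classifier outputs for four unlabeled cells, two classes.
example_probs = [
    [0.97, 0.03],
    [0.55, 0.45],
    [0.08, 0.92],
    [0.40, 0.60],
]
pseudolabels = generate_pseudolabels(example_probs, threshold=0.9)
```

Under this assumed rule, only confidently predicted cells are added to the training set, and ambiguous cells remain available for human annotation in the next iteration.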
Main Outcomes and Measures
Primary outcomes were classification accuracy for nuclear morphology features (mitotic figures, nucleoli, chromatin, shape) and annotation efficiency, measured as time per cell. Secondary outcomes included inter-annotator agreement and model performance scaling with training data size.
Results
The framework processed 8 slides containing hundreds of thousands of cells. Latent space clustering-based annotation required 47 minutes versus 63 minutes for sequential annotation (25% reduction, 95% CI 18%-32%). Classification accuracy reached 96.3% ± 1.2% for mitotic figures and 98.3% ± 1.4% for nucleoli with 1075 labeled samples. Nuclear shape classification achieved 59.5% ± 2.1% accuracy. Inter-annotator agreement was highest for chromatin (100%) and nucleoli (95%, κ = 0.95), moderate for mitotic figures (64%, κ = 0.58), and lowest by κ for shape (81%, κ = 0.36). Performance scaled efficiently, with nucleoli classification reaching 95.5% ± 1.5% accuracy using only 215 samples.
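Two of the reported quantities can be re-derived with a short sketch: the relative time reduction, and Cohen's κ, which corrects raw agreement for chance and can therefore be low even when percent agreement is high. The confusion matrices below are hypothetical illustrations, not the study's data, and the function names are assumptions.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


def cohens_kappa(matrix):
    """Cohen's kappa for a square annotator-vs-annotator count matrix.

    matrix[i][j] counts cells labeled class i by annotator A and
    class j by annotator B. Returns (observed_agreement, kappa).
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance if the annotators labeled independently
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return observed, (observed - expected) / (1 - expected)


# 63 minutes sequential versus 47 minutes clustering-based (from the text).
time_saving = relative_reduction(63, 47)

# Hypothetical matrices with the same 90% raw agreement.
balanced = [[45, 5], [5, 45]]      # classes equally frequent
imbalanced = [[80, 5], [5, 10]]    # one class dominates
agree_balanced, kappa_balanced = cohens_kappa(balanced)
agree_imbalanced, kappa_imbalanced = cohens_kappa(imbalanced)
```

With these hypothetical counts, both matrices show 90% raw agreement, yet κ is about 0.80 for the balanced case and about 0.61 for the imbalanced one. This is why a feature can have higher percent agreement than another and still have a lower κ.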
Conclusions and Relevance
This semi-supervised active learning framework substantially reduces annotation burden while achieving expert-level accuracy for well-defined morphological features. The open-source tool accelerates histopathological dataset curation for cancer research and enables deployment of AI-assisted diagnostics in resource-constrained settings.
GitHub
https://github.com/shouryaverma/AnnotateAnyCell
Key Points
Question
Can a semi-supervised deep learning framework reduce the annotation burden for pathologists while maintaining accuracy in digital pathology?
Findings
In this methods study evaluating eight canine urothelial carcinoma slides, the AnnotateAnyCell framework achieved 96% accuracy for mitotic figure classification and 98% for nucleoli classification using just 1075 labeled samples, while reducing annotation time by 25% compared to sequential annotation.
Meaning
This open-source framework enables efficient large-scale cellular annotation in histopathology, potentially accelerating cancer research and clinical application.