Deep Learning Based Automated HER2 Score Prediction Using Immunohistochemistry Histopathological Images: A Dual-Center Study

Abstract

Background: HER2 is a critical prognostic biomarker in breast cancer and is associated with aggressive tumor biology. Current IHC scoring is subjective and labor-intensive. Deep learning has demonstrated success in histopathological image analysis, yet HER2 IHC automation remains underexplored. External-center validation is essential to establish clinical credibility and to demonstrate robustness across diverse institutional practices and imaging protocols.

Methods: This dual-center retrospective study analyzed 135 HER2 IHC whole-slide images (WSIs) from 118 breast cancer patients, labeled as 1+, 2+, or 3+ by standard clinical criteria. Two board-certified pathologists manually annotated tumor-enriched regions of interest (ROIs), which were tiled into non-overlapping 512×512 patches; tiles with >60% white background were excluded. Patches were harmonized using a modified Macenko color normalization and augmented during training. Six pretrained deep learning models (AlexNet, VGG16, ResNet34, DenseNet121, Inception, Swin Transformer) were trained with patient-level splits and evaluated on an independent test set using macro-averaged AUC and complementary metrics.

Results: The cohort included 118 patients with comparable age and largely similar baseline imaging/pathologic characteristics across groups, although clinical symptoms and lymph node status differed. On the independent test set, all models showed good discrimination for three-class HER2 grading, with AlexNet performing best (macro-AUC 0.971), followed by VGG16 (0.967). For AlexNet, per-class AUCs were 0.980 (1+), 0.955 (2+), and 0.979 (3+); most errors occurred between adjacent grades (1+/2+, 2+/3+). Grad-CAM highlighted strongly stained tumor regions as the drivers of predictions.

Conclusion: A dual-center deep learning framework enabled accurate automated three-class HER2 IHC grading from tumor-enriched WSI patches. This approach may assist pathologists by improving scoring consistency and flagging equivocal cases for reflex FISH.
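The macro-averaged AUC used as the headline metric can be illustrated with a short sketch. The abstract does not say how the macro average was computed; the code below assumes the common one-vs-rest scheme, where each HER2 grade is scored against the other two and the per-class AUCs are averaged without weighting. The function names (`binary_auc`, `macro_auc`) and the toy probabilities are hypothetical and are not taken from the study.

```python
from statistics import mean


def binary_auc(scores, labels):
    """AUC as the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive sample scores higher; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(probs, y_true, classes):
    """One-vs-rest AUC per class, then the unweighted mean (assumed scheme)."""
    per_class = {}
    for k, cls in enumerate(classes):
        scores = [p[k] for p in probs]                    # predicted P(class k)
        labels = [1 if y == cls else 0 for y in y_true]   # class k vs the rest
        per_class[cls] = binary_auc(scores, labels)
    return per_class, mean(list(per_class.values()))


# Toy data (invented for illustration): six patches, two per grade.
classes = ["1+", "2+", "3+"]
y_true = ["1+", "1+", "2+", "2+", "3+", "3+"]
probs = [
    [0.80, 0.15, 0.05],
    [0.50, 0.40, 0.10],
    [0.30, 0.60, 0.10],
    [0.45, 0.35, 0.20],  # a 2+ patch that leans toward 1+
    [0.05, 0.25, 0.70],
    [0.10, 0.50, 0.40],  # a 3+ patch that leans toward 2+
]
per_class, macro = macro_auc(probs, y_true, classes)

# Unweighted mean of the per-class AUCs reported for AlexNet in the abstract.
reported_macro = round(mean([0.980, 0.955, 0.979]), 3)  # 0.971
```

On the toy data, the two adjacent-grade confusions lower only the 2+ AUC (0.75), while 1+ and 3+ remain at 1.0. The unweighted mean of the three reported AlexNet per-class AUCs is 0.971, which matches the reported macro-AUC and is consistent with the averaging assumed here.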
