Artificial Intelligence Model’s assessment of intra-sample heterogeneity of HER2 IHC in breast cancer is related to interobserver and intraobserver agreement among pathologists

Abstract

Background Accurate assessment of HER2 status in breast cancer has long been critical for guiding therapy and has become even more important with the emergence of antibody-drug conjugates, which are now also indicated in HER2-low tumors. However, inter- and intraobserver variability limits the reproducibility of HER2 IHC scoring among pathologists, particularly at the lower end of the expression spectrum. Artificial intelligence (AI) models have the potential to standardize scoring, improve diagnostic accuracy, and offer new insight into the shortcomings of current practice.

Methods We recruited generalist and specialist pathologists from Rede D’Or centers across Brazil to assess digitized HER2 IHC whole slide images. The same images were presented to the pathologists twice, one month apart, and were also scored by the AIM-HER2 AI model (PathAI®, Boston, MA). Intra- and interobserver agreement, as well as concordance with the AI model, were measured across 126 breast cancer samples. The association between sample features and agreement metrics was also analyzed using the AI model's spatial breakdown data.

Results Among 34 pathologists, median intraobserver concordance was 67.68% and median concordance with the AI model was 60.8%. Median interobserver agreement was 67.65%, with high agreement (> 85%) in 25.4% of samples. Significant positive correlations were observed among all agreement metrics. Samples with lower intra-sample heterogeneity, as determined by AI spatial breakdown scores, were associated with higher agreement levels.

Conclusions Our findings highlight substantial variability in HER2 IHC scoring among pathologists. AI models such as AIM-HER2 can be used to increase the reproducibility of the analysis and to flag the most difficult samples, which are the ones most likely to result in diagnostic discordance. AI-assessed intra-sample heterogeneity was correlated with a lower agreement rate among pathologists.
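As a rough illustration of the agreement metrics named in the Methods, the sketch below computes intraobserver concordance, concordance with an AI score, and per-sample interobserver agreement as simple percent agreement. The abstract does not state how these metrics were actually calculated, so the definitions here (exact-match percentage, pairwise agreement per sample) are assumptions, as are the function names and the toy scores, which are invented and are not study data.

```python
from itertools import combinations
from statistics import median


def percent_agreement(a, b):
    """Percentage of positions at which two equal-length score lists match."""
    if len(a) != len(b) or not a:
        raise ValueError("score lists must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def interobserver_agreement_per_sample(scores_by_reader):
    """For each sample, the percentage of reader pairs that gave the same score.

    Assumed definition: pairwise exact agreement, one value per sample.
    """
    readers = list(scores_by_reader.values())
    n_samples = len(readers[0])
    result = []
    for i in range(n_samples):
        pairs = list(combinations([r[i] for r in readers], 2))
        result.append(100.0 * sum(x == y for x, y in pairs) / len(pairs))
    return result


# Toy HER2 IHC scores (0, 1+, 2+, 3+) for three hypothetical readers
# on four hypothetical samples, read twice one month apart.
round1 = {
    "P1": ["0", "1+", "2+", "3+"],
    "P2": ["1+", "1+", "2+", "3+"],
    "P3": ["0", "2+", "2+", "3+"],
}
round2 = {
    "P1": ["0", "1+", "2+", "3+"],
    "P2": ["0", "1+", "2+", "3+"],
    "P3": ["0", "1+", "3+", "3+"],
}
ai_scores = ["0", "1+", "2+", "3+"]

# Intraobserver concordance: each reader's first read vs. second read.
intra = {p: percent_agreement(round1[p], round2[p]) for p in round1}

# Concordance with the AI model, using the first read.
with_ai = {p: percent_agreement(round1[p], ai_scores) for p in round1}

# Interobserver agreement per sample, using the first read.
inter = interobserver_agreement_per_sample(round1)

print("median intraobserver concordance:", median(intra.values()))
print("median concordance with AI:", median(with_ai.values()))
print("median interobserver agreement:", median(inter))
```

Percent agreement is used here only because the abstract reports percentages; chance-corrected statistics such as Cohen's or Fleiss' kappa are common alternatives for this kind of analysis.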
