Impact of Whole Slide Image Blurriness on the Robustness of Artificial Intelligence in Real World Setting: Retrospective Observational Study

Abstract

Context

In digital pathology, blurriness in whole slide images (WSIs) is a common issue, and severe blurriness is widely acknowledged as a critical factor that can degrade the performance of artificial intelligence (AI) models. However, the effects of the typical levels of blurriness observed in real-world pathological images on the robustness of AI predictions remain unclear and unexplored.

Objective

To evaluate the impact of WSI blurring on the robustness of AI predictions in a real-world setting.

Design

A retrospective study was conducted using 8,000 WSIs and the corresponding AI predictions from four AI models trained on data from two scanners and two organs. WSIs were categorized into concordant and discordant groups based on AI-prediction accuracy. Analyses included: 1) comparing blur metrics between the groups, 2) estimating the odds ratio relating the proportion of blurry patches in a WSI to prediction concordance, and 3) assessing model performance across varying blur intensities.
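As an illustration of the kind of blur metric compared in analysis 1, the following is a minimal sketch of Laplacian variance computed on a single grayscale patch. The 4-neighbour kernel, the function name `laplacian_variance`, and the toy patches are assumptions made for this example; the abstract does not specify the kernel, the patch size, or the preprocessing the authors used, so values from this sketch are not comparable to those reported in the study.

```python
from statistics import pvariance


def laplacian_variance(patch):
    """Variance of the 4-neighbour Laplacian response over interior pixels.

    `patch` is a 2D list of grayscale intensities (hypothetical input format).
    Lower values indicate fewer sharp edges, i.e. a blurrier patch.
    """
    height, width = len(patch), len(patch[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Discrete Laplacian: sum of the four neighbours minus 4x the centre.
            lap = (
                patch[y - 1][x]
                + patch[y + 1][x]
                + patch[y][x - 1]
                + patch[y][x + 1]
                - 4 * patch[y][x]
            )
            responses.append(lap)
    return pvariance(responses)


# Toy patches (illustrative only, not study data).
flat_patch = [[10] * 5 for _ in range(5)]  # no edges at all
checker_patch = [[255 if (x + y) % 2 == 0 else 0 for x in range(5)] for y in range(5)]

flat_score = laplacian_variance(flat_patch)
checker_score = laplacian_variance(checker_patch)
```

A featureless patch yields a score of zero, while a patch with strong edges yields a large score, which is why a low Laplacian variance is commonly read as a sign of blur.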

Results

For each organ-scanner pair, the average wavelet score and Laplacian variance of WSIs did not differ significantly between the two groups (p > 0.05 for both metrics), except for one model, and the effect sizes were small (Cohen's d < 0.2 for both metrics). Additionally, no statistically significant association was observed between AI prediction concordance and the proportion of blurry patches in WSIs (the confidence intervals of the odds ratios included 1). Model performance remained robust even at a high blur level (radius = 1), at which patch images had a Laplacian variance of 162.88 and a wavelet score of 1880.07, corresponding to the top 1.22% and 2.16% of blurriness in our dataset, respectively.
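The two summary statistics quoted above can be sketched as follows. This is a simplified illustration under stated assumptions: Cohen's d uses the pooled standard deviation, and the odds ratio is computed from a 2x2 table with a Wald 95% confidence interval, which presumes WSIs are dichotomized into "blurry" and "not blurry". The abstract does not say how the authors estimated the odds ratio (a regression on the blurry-patch proportion is equally plausible), and every number below is made up for demonstration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (sample variances)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b))
        / (n_a + n_b - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Hypothetical layout (an assumption, not the authors' design):
      a = blurry and discordant,     b = blurry and concordant
      c = not blurry and discordant, d = not blurry and concordant
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up blur scores for two small groups (not study data).
concordant_scores = [2.0, 3.0, 4.0]
discordant_scores = [1.0, 2.0, 3.0]
effect_size = cohens_d(concordant_scores, discordant_scores)

# Made-up counts in which blur and discordance are unrelated.
odds_ratio, ci_lower, ci_upper = odds_ratio_ci(10, 90, 10, 90)
```

When the confidence interval spans 1, as it does for the made-up table here, the data are compatible with no association between blur and discordance, which is the reading the Results give for the study's own intervals.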

Conclusions

The findings empirically suggest that the typical levels of WSI blurriness encountered in real-world settings may not significantly compromise the robustness of AI predictions.
