Robust and accurate diagnosis of infectious skin diseases from histopathology images by integrating deep learning and explainable AI
Abstract
Accurate diagnosis of infectious skin diseases remains a major challenge, particularly for neglected tropical diseases such as mycetoma, where precise pathogen identification is crucial for effective treatment. Histopathology imaging is the diagnostic gold standard, involving examination of tissue biopsies to identify characteristic inflammatory patterns, cellular changes, or microbial pathogens. However, its analysis is often limited by variability in tissue sampling and staining, subjective interpretation, inter-observer differences, and the absence of visible microbial grains in early disease stages. To alleviate these challenges, we develop the Skin INfectious Diseases Intelligent (SINDI) framework, an integrated machine learning pipeline combining shallow learning, deep learning, stain normalization, and explainable AI to automate and enhance diagnostic accuracy from histopathology images. The SINDI framework is designed to systematically tackle increasingly complex diagnostic tasks, including (1) disease phenotype classification and pathogen species identification, (2) assessing the importance of disease-specific regions (grains) and classifying grain-free images lacking visible microbial structures, (3) semantic segmentation of pathological features, and (4) explainable AI-driven interpretable decision support. Leveraging a comprehensive dataset of 1,324 histopathology images, curated by expert pathologists and representing four predominant mycetoma pathogens, alongside 7,000 healthy skin tissue images, SINDI demonstrated near-perfect accuracy in binary and multi-class classification tasks, particularly when employing Macenko stain normalization and domain-specific features. Remarkably, SINDI achieved high accuracy on images with masked grain regions and even on grain-free images, which are considered diagnostically intractable by human experts.
Semantic segmentation models accurately delineated phenotype-related regions, while explainable AI methods provided transparent and clinically relevant interpretability of model decisions. Our results indicate that diagnostically relevant information is distributed beyond visible lesion areas, challenging traditional pathology paradigms. The SINDI framework thus represents a significant advance in automated infectious skin disease diagnostics, offering robust, interpretable, and scalable decision-support tools adaptable to diverse clinical settings.
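The abstract names Macenko stain normalization as a preprocessing step but does not describe how SINDI implements it. The following is a minimal NumPy sketch of the generally described Macenko procedure, not the authors' code. The function names (`estimate_stain_vectors`, `macenko_normalize`), the thresholds `beta=0.15` and `alpha=1.0`, and the choice to normalize against a reference image rather than fixed reference stain vectors are all assumptions made for illustration.

```python
import numpy as np


def _optical_density(rgb):
    """Convert an RGB uint8 image to optical density, one row per pixel."""
    flat = rgb.reshape(-1, 3).astype(np.float64)
    # Beer-Lambert style transform; the +1 / 256 offset avoids log(0).
    return -np.log((flat + 1.0) / 256.0)


def estimate_stain_vectors(rgb, beta=0.15, alpha=1.0):
    """Estimate two stain directions (3x2 matrix) from one image.

    beta  : assumed OD threshold for discarding near-transparent pixels.
    alpha : assumed percentile used to pick robust extreme angles.
    """
    od = _optical_density(rgb)
    tissue = od[np.all(od > beta, axis=1)]

    # eigh sorts eigenvalues in ascending order, so the last two
    # eigenvectors span the plane holding most of the stain variation.
    _, vecs = np.linalg.eigh(np.cov(tissue.T))
    plane = vecs[:, 1:3]

    # Project onto the plane and take robust extreme angles.
    proj = tissue @ plane
    phi = np.arctan2(proj[:, 1], proj[:, 0])
    lo, hi = np.percentile(phi, [alpha, 100.0 - alpha])
    v1 = plane @ np.array([np.cos(lo), np.sin(lo)])
    v2 = plane @ np.array([np.cos(hi), np.sin(hi)])

    # Heuristic ordering: the vector with the larger red-channel OD goes
    # first (typically hematoxylin in H&E slides).
    if v1[0] > v2[0]:
        return np.stack([v1, v2], axis=1)
    return np.stack([v2, v1], axis=1)


def _concentrations(rgb, stains):
    """Least-squares stain concentrations, shape (2, n_pixels)."""
    conc, *_ = np.linalg.lstsq(stains, _optical_density(rgb).T, rcond=None)
    return conc


def macenko_normalize(source, reference):
    """Map the stain appearance of `source` onto that of `reference`."""
    s_src = estimate_stain_vectors(source)
    s_ref = estimate_stain_vectors(reference)
    c_src = _concentrations(source, s_src)
    c_ref = _concentrations(reference, s_ref)

    # Match the 99th-percentile concentration of each stain.
    scale = np.percentile(c_ref, 99, axis=1) / np.percentile(c_src, 99, axis=1)
    od_norm = (s_ref @ (c_src * scale[:, None])).T

    # Invert the optical-density transform and restore the image shape.
    out = 256.0 * np.exp(-od_norm) - 1.0
    return np.clip(np.rint(out), 0, 255).astype(np.uint8).reshape(source.shape)
```

In a pipeline of the kind the abstract describes, a call such as `macenko_normalize(tile, reference_tile)` would presumably run on each image before feature extraction or classification, with `reference_tile` being a well-stained slide chosen by the user. Both arguments are expected to be `H x W x 3` arrays of type `uint8`.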