Dual-Stage AI System for Pathologist-Free Tumor Detection and Subtyping in Oral Squamous Cell Carcinoma

Abstract

Background

Accurate histological grading of oral squamous cell carcinoma (OSCC) is critical for prognosis and treatment planning. Current methods lack automation for OSCC detection, subtyping, and differentiation from high-risk pre-malignant conditions such as oral submucous fibrosis (OSMF). Furthermore, whole-slide image (WSI) analysis is time-consuming and variable, limiting consistency. We present a clinically relevant deep learning framework that leverages weakly supervised learning and attention-based multiple instance learning (MIL) to enable automated OSCC grading and early prediction of malignant transformation from OSMF.

Methods

We conducted a multi-institutional retrospective cohort study using a curated dataset of 1,925 WSIs, comprising 1,586 OSCC cases stratified into well-, moderately-, and poorly-differentiated (WD, MD, and PD) subtypes, 128 normal controls, and 211 cases of OSMF and OSMF with OSCC. We developed a two-stage deep learning pipeline named OralPatho. In stage one, an attention-based MIL model was trained for binary classification (normal vs. OSCC). In stage two, a gated attention mechanism with top-K patch selection was used to classify OSCC subtypes. Model performance was assessed using stratified 3-fold cross-validation and external validation on an independent dataset.
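
To make the stage-two architecture concrete, the sketch below shows a generic gated attention MIL pooling module with optional top-K patch selection. It is an illustrative assumption in the spirit of the description above, not the authors' released OralPatho code; the layer sizes, class count, and top-K value are placeholders.

```python
# Minimal sketch of gated attention-based MIL with top-K patch selection.
# All dimensions and the top-K strategy are illustrative assumptions.
import torch
import torch.nn as nn


class GatedAttentionMIL(nn.Module):
    """Bag-level classifier: patch embeddings -> gated attention pooling -> slide logits."""

    def __init__(self, feat_dim=1024, hidden_dim=256, n_classes=2, top_k=None):
        super().__init__()
        self.attn_v = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Tanh())
        self.attn_u = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.Sigmoid())
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(feat_dim, n_classes)
        self.top_k = top_k  # if set, aggregate only the K highest-attention patches

    def forward(self, patch_feats):  # patch_feats: (num_patches, feat_dim)
        # Gated attention score per patch: w^T (tanh(V h) * sigmoid(U h))
        scores = self.attn_w(self.attn_v(patch_feats) * self.attn_u(patch_feats))  # (N, 1)
        if self.top_k is not None and patch_feats.size(0) > self.top_k:
            top_idx = scores.squeeze(1).topk(self.top_k).indices
            patch_feats, scores = patch_feats[top_idx], scores[top_idx]
        attn = torch.softmax(scores, dim=0)            # normalise over (selected) patches
        slide_feat = (attn * patch_feats).sum(dim=0)   # attention-weighted slide embedding
        return self.classifier(slide_feat), attn.squeeze(1)


# Usage: one WSI as a "bag" of 500 pre-extracted 1024-d patch embeddings.
feats = torch.randn(500, 1024)
model = GatedAttentionMIL(n_classes=3, top_k=64)  # stage-two style: WD / MD / PD
logits, attention = model(feats)
print(logits.shape, attention.shape)              # torch.Size([3]) torch.Size([64])
```

The attention weights returned alongside the logits are what an attention map over the slide would be built from; only slide-level (weak) labels are needed for training.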

Findings

The binary classifier demonstrated robust performance, with a mean F1-score exceeding 0.93 across all validation folds. The multiclass model achieved consistent macro-F1 scores of 0.72, 0.70, and 0.68 across the validation folds, with AUCs of 0.79, 0.71, and 0.61 for the WD, MD, and PD subtypes, respectively. Model generalizability was validated on an independent external dataset. Attention maps reliably highlighted clinically relevant histological features, supporting the system's interpretability and its diagnostic alignment with expert pathological assessment.
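
For reference, the fold-level metrics quoted above (macro-F1 and per-class one-vs-rest AUC) could be computed as in the following sketch; the arrays are dummy placeholders, not study data.

```python
# Illustrative computation of macro-F1 and per-class one-vs-rest AUC with scikit-learn.
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

classes = ["WD", "MD", "PD"]
y_true = np.array([0, 1, 2, 1, 0, 2, 1, 0])                  # ground-truth subtype labels
y_prob = np.random.dirichlet(np.ones(3), size=len(y_true))   # predicted class probabilities
y_pred = y_prob.argmax(axis=1)

macro_f1 = f1_score(y_true, y_pred, average="macro")         # unweighted mean of per-class F1
for idx, name in enumerate(classes):
    auc = roc_auc_score((y_true == idx).astype(int), y_prob[:, idx])  # one-vs-rest AUC
    print(f"{name}: AUC = {auc:.2f}")
print(f"macro-F1 = {macro_f1:.2f}")
```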

Interpretation

This study demonstrates the feasibility of attention-based, weakly supervised learning for accurate OSCC grading from whole-slide images. OralPatho combines high diagnostic performance with real-time interpretability, making it a scalable solution for both advanced pathology labs and resource-limited settings.
