Non-inferiority of Automated Deep Learning-Based 18F-FDG PET/CT Tumour Volume Compared to Manual GTV for Prognostic Modelling in Head and Neck Cancer

Abstract

Background: Manual segmentation of gross tumour volumes (GTV) on 18F-FDG PET/CT is time-consuming and subject to interobserver variability, limiting its scalability for prognostic modelling in head and neck cancer. We investigated whether deep learning-based PET tumour volumes (AI-PET-GTV) could replace manually defined GTVs in risk prediction models for loco-regional failure (LRF) and distant metastasis (DM).

Results: Using competing risk regression, we tested whether AI-PET-GTV was non-inferior to manual GTV in predicting LRF, with the primary outcome being area under the receiver operating characteristic curve (AUC) at 3 years, using a non-inferiority margin of 5 percentage points. AI-PET-GTV achieved a 3-year AUC of 72.9% (95% CI: 67.9–77.9%) compared to 72.8% (95% CI: 67.8–77.9%) for manual GTV (p = 0.02). At 1 year, AUCs were 77.3% (95% CI: 72.2–82.4%) and 76.9% (95% CI: 71.9–82.0%) for AI and manual GTV, respectively (p = 0.02). Similar patterns were observed for DM prediction at 1 and 3 years (all p < 0.01), and Brier scores also favoured AI-PET-GTV at both timepoints (p < 0.02). Stratification based on predicted risk yielded nearly identical cumulative incidence estimates. For example, the 3-year cumulative incidence of LRF in the high-risk group was 38.4% (95% CI: 32.6–44.2%) for both models.

Conclusions: Automated deep learning-based PET tumour volumes are non-inferior to manual GTVs for prognostic modelling of LRF and DM in head and neck cancer. These findings support clinical implementation of AI-derived volumes for reproducible, scalable, and earlier risk stratification in oncology workflows.
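The non-inferiority comparison described above can be illustrated with a minimal sketch. It assumes a one-sided z-test on the AUC difference against the 5-percentage-point margin; the abstract does not state which test the authors used, so this is not necessarily their method. The function name `noninferiority_p` and the standard error `se_diff` are hypothetical: the standard error of the paired AUC difference is not reported in the abstract, and the value used below is a placeholder for illustration only.

```python
import math


def noninferiority_p(auc_new, auc_ref, se_diff, margin=0.05):
    """One-sided z-test of H0: auc_new - auc_ref <= -margin.

    Assumption: the AUC difference is approximately normal with
    standard error se_diff (hypothetical input, not from the abstract).
    Returns (z, p), where a small p supports non-inferiority.
    """
    if se_diff <= 0:
        raise ValueError("se_diff must be positive")
    # Distance of the observed difference above the non-inferiority bound.
    z = ((auc_new - auc_ref) + margin) / se_diff
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p


# 3-year LRF AUCs from the abstract; se_diff = 0.03 is a placeholder.
z, p = noninferiority_p(auc_new=0.729, auc_ref=0.728, se_diff=0.03)
print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

Because both AUCs are computed on the same patients, the two estimates are strongly correlated, so the standard error of their difference can be much smaller than the individual confidence intervals suggest. In practice it would be estimated with a paired method such as bootstrapping rather than assumed.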
