Global and regional accuracy of deep learning-based tumor segmentation from whole-body [¹⁸F]fluorodeoxyglucose PET/CT images
Abstract
Background The number of [¹⁸F]fluorodeoxyglucose ([¹⁸F]FDG)-PET/CT scans performed has increased significantly over the last decade, in line with the increasing trend in oncological malignancies. These images, which highlight areas of high glucose uptake, are key to defining the extent of disease, staging, and assessing response to therapy. Processing and evaluating [¹⁸F]FDG-PET/CT scans, however, requires manual annotation by well-trained specialists and, above all, time. In time- and resource-constrained settings, meeting the increasing demand for PET/CT scans has become challenging. The main goal of our study was to test the relationship between the volumes predicted by a deep learning algorithm and the manually segmented ones. The secondary objective was to measure the extent to which predictive accuracy is associated with normal background uptake.
Results The study sample included 1159 [¹⁸F]FDG-PET/CT scans from subjects with histologically confirmed diagnoses of lung cancer, lymphoma, or melanoma. A total of 881 (70%) scans were used as the training dataset and 232 (20%) as an internal validation dataset. A subsample of 116 (10%) scans not used for training served as the test dataset. The segmentation model was implemented with the nnU-Net convolutional network available in the MONAI framework. Model performance was measured with the Dice score. Correlation between manual and predicted segmentations was assessed using linear correlation. The TotalSegmentator tool was used to identify lesion locations and to assess the tumor-to-background ratio (TBR) for quantitative analysis. The network achieved Dice scores of 0.805 (validation) and 0.784 (test), showing strong agreement with manual segmentations. Anatomical localization was successful in 74% of the 7914 detected lesions. A high correlation (R=0.88, p<0.0001) was observed between predicted and ground truth volumes.
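As an illustration of the two agreement measures named above, the following is a minimal sketch of a Dice score between binary masks and a Pearson correlation between lesion volumes. It is not the study's evaluation code: the function names, the voxel spacing, and the toy masks are assumptions made for the example, and the study itself used the nnU-Net/MONAI pipeline.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

def volume_ml(mask, voxel_spacing_mm):
    """Volume of a binary mask in millilitres, given voxel spacing in mm."""
    voxel_mm3 = float(np.prod(voxel_spacing_mm))
    return float(np.asarray(mask, dtype=bool).sum() * voxel_mm3 / 1000.0)

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(x, y)[0, 1])

# Toy masks (not study data): an 8-voxel "manual" lesion and a
# 12-voxel "predicted" lesion that fully contains it.
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:4] = True

toy_dice = dice_score(pred, truth)                # 2*8 / (12 + 8) = 0.8
toy_volume = volume_ml(truth, (2.0, 2.0, 2.0))    # 8 voxels * 8 mm^3 = 0.064 ml
```

In practice the per-scan volumes from the manual and predicted masks would be collected into two arrays and passed to `pearson_r`; the p-value reported in the abstract would need a statistical routine that this sketch does not include.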
Segmentation accuracy improved with higher TBRs: lesions with TBR > 2 had significantly better Dice scores than lower-contrast lesions (TBR between 1 and 2, or TBR ≤ 1).
Conclusions These results are consistent with previous reports on PET-based segmentation, further validating nnU-Net as a reliable approach for detecting hypermetabolic lesions and assessing global disease burden in FDG-PET imaging. Moreover, the significant relationship between TBR and segmentation accuracy suggests that further improvements may be possible by integrating the metabolic profile into the predictive model.
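The TBR stratification described in the abstract can be sketched as follows. The abstract does not state how TBR was computed, so this example assumes TBR is the maximum lesion uptake divided by the mean uptake of a reference background region; the function names, group labels, and numeric values are hypothetical and are not study results.

```python
import numpy as np

def tumor_to_background_ratio(suv, lesion_mask, background_mask):
    """Assumed definition: max uptake in the lesion / mean background uptake."""
    suv = np.asarray(suv, dtype=float)
    lesion_mask = np.asarray(lesion_mask, dtype=bool)
    background_mask = np.asarray(background_mask, dtype=bool)
    if not lesion_mask.any() or not background_mask.any():
        raise ValueError("lesion and background masks must be non-empty")
    return float(suv[lesion_mask].max() / suv[background_mask].mean())

def tbr_group(tbr):
    """Assign a lesion to one of three contrast groups."""
    if tbr <= 1.0:
        return "TBR <= 1"
    if tbr <= 2.0:
        return "1 < TBR <= 2"
    return "TBR > 2"

def mean_dice_by_group(tbrs, dices):
    """Average per-lesion Dice scores within each TBR group."""
    groups = {}
    for tbr, dice in zip(tbrs, dices):
        groups.setdefault(tbr_group(tbr), []).append(dice)
    return {name: float(np.mean(vals)) for name, vals in groups.items()}

# Toy volume (not study data): uniform background uptake of 2.0
# with a single hot voxel of 6.0 at the centre.
suv = np.full((3, 3, 3), 2.0)
suv[1, 1, 1] = 6.0
lesion = np.zeros((3, 3, 3), dtype=bool)
lesion[1, 1, 1] = True
background = ~lesion

toy_tbr = tumor_to_background_ratio(suv, lesion, background)  # 6.0 / 2.0 = 3.0

# Made-up per-lesion values, used only to show the grouping step.
summary = mean_dice_by_group([0.8, 1.5, 3.0, 4.0], [0.4, 0.6, 0.8, 0.9])
```

A comparison such as the one reported would then test whether Dice scores differ between these groups; the choice of statistical test is left open here because the abstract does not name one.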