Evaluation of Model Performance and Clinical Usefulness in Automated Rectal Segmentation in CT for Prostate and Cervical Cancer
Abstract
Background: Precise delineation of the rectum is crucial in treatment planning for cancers in the pelvic region, such as prostate and cervical cancers. Manual segmentation remains time-consuming and suffers from inter-observer variability. Because rectal anatomy differs meaningfully between males and females, incorporating sex-specific anatomical patterns can improve segmentation performance. Recent advances in deep learning offer promising ways to classify patient sex automatically from CT scans and to use this information to improve the accuracy of rectal segmentation. However, the clinical utility of such methods requires comprehensive validation against real-world standards.

Methods: In this study, a two-stage deep learning pipeline was developed using CT scans from 186 patients with either prostate or cervical cancer. First, a CNN model automatically classified the patient’s biological sex from CT images in order to capture sex-dependent anatomical variations. Second, a sex-aware U-Net model performed automated rectal segmentation, allowing the network to adjust its feature representation based on the anatomical differences identified in stage one. For internal validation, the data were divided using an 80/20 train–test split, with 15% of the training portion held out for validation; the splits were balanced with respect to sex and diagnosis. Model performance was evaluated using spatial similarity metrics, including the Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), and Average Surface Distance (ASD). Additionally, a radiation oncologist conducted a retrospective clinical evaluation using a 3-point Likert scale. Statistical significance was examined using Wilcoxon signed-rank tests, Welch’s t-tests, and Mann–Whitney U tests.

Results: The sex-classification model attained an accuracy of 94.6% (AUC = 0.98, 95% CI: 0.96–0.99). Incorporating predicted sex into the segmentation pipeline improved the anatomical consistency of the U-Net outputs. Mean DSC values were 0.91 (95% CI: 0.89–0.92) for prostate cases and 0.89 (95% CI: 0.87–0.91) for cervical cases, with no significant difference between groups (p = 0.12). Surface distance metrics, calculated on resampled isotropic voxels, showed a mean HD of 3.4 ± 0.8 mm and a mean ASD of 1.2 ± 0.3 mm, consistent with clinically acceptable accuracy. On clinical evaluation, 89.2% of contours were rated as excellent, while 9.1% required only minor adjustments. Automated segmentation reduced the average contouring time from 12.7 ± 2.3 min for manual contouring to 4.3 ± 0.9 min.

Conclusions: The proposed sex-aware deep learning framework offers accurate, robust segmentation of the rectum in pelvic CT imaging by explicitly modeling sex-specific differences in anatomical characteristics. This physiologically informed approach enhances segmentation performance and supports reliable integration of AI-based delineation into radiotherapy workflows to improve both contouring efficiency and clinical consistency.
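
For readers unfamiliar with the spatial similarity metrics reported above, the following is a minimal sketch of how DSC, HD, and ASD can be computed for two binary masks. It is not the authors' implementation: the abstract does not give the exact definitions used (for example, maximum versus 95th-percentile HD, or how surface voxels are extracted), so the choices below are assumptions. Masks are represented as sets of voxel indices, the surface is taken to be the voxels with a 6-connected neighbour outside the mask, and all function names (`dice`, `surface`, `surface_metrics`, `cube`) are hypothetical. The toy cubes stand in for a manual and an automated contour on an isotropic 1 mm grid.

```python
from itertools import product
from math import dist

# 6-connected neighbourhood offsets (z, y, x); an assumed surface definition.
_NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dice(a, b):
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def surface(mask):
    """Voxels of the mask with at least one 6-connected neighbour outside it."""
    return {
        v for v in mask
        if any((v[0] + dz, v[1] + dy, v[2] + dx) not in mask
               for dz, dy, dx in _NEIGHBOURS)
    }


def _directed(src, dst, spacing):
    """Distance (mm) from every surface voxel in src to the nearest one in dst."""
    def to_mm(v):
        return tuple(c * s for c, s in zip(v, spacing))

    dst_mm = [to_mm(v) for v in dst]
    return [min(dist(to_mm(v), w) for w in dst_mm) for v in src]


def surface_metrics(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (HD, ASD) in mm: symmetric maximum and symmetric mean distance."""
    sa, sb = surface(a), surface(b)
    d_ab = _directed(sa, sb, spacing)
    d_ba = _directed(sb, sa, spacing)
    hd = max(max(d_ab), max(d_ba))
    asd = (sum(d_ab) + sum(d_ba)) / (len(d_ab) + len(d_ba))
    return hd, asd


def cube(lo, hi):
    """Toy mask: all voxels with every index in [lo, hi)."""
    return set(product(range(lo, hi), repeat=3))


# Toy example: a 4x4x4 "manual" contour and a copy shifted by one voxel in x.
reference = cube(0, 4)
predicted = {(z, y, x + 1) for z, y, x in reference}

dsc = dice(reference, predicted)
hd, asd = surface_metrics(reference, predicted)
print(dsc, hd, round(asd, 3))  # 0.75 1.0 0.357
```

The brute-force nearest-neighbour search is quadratic in the number of surface voxels, which is acceptable for a toy example; on full-resolution CT volumes a distance transform or a spatial index would normally be used instead.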