A Method for Sensitivity Analysis of Automatic Contouring Algorithms Across Different MRI Contrast Weightings Using SyntheticMR

Lucas McCullum
Zayne Belal
Warren Floyd
Alaa Mohamed Shawky Ali
Natalie West
Samuel Mulder
Yao Ding
Jiaofeng Xu
Dan Thill
Nicolette O’Connell
Joseph Stancanello
Kareem A. Wahid
David T. Fuentes
Ken-Pin Hwang
Clifton D. Fuller

Abstract

Background

Currently, most institution-specific automatic MRI-based contouring algorithms are trained, tested, and validated on a single contrast weighting (e.g., T2-weighted); however, their performance within that contrast weighting (i.e., across different repetition times, TR, and echo times, TE) is under-investigated and poorly understood. As a result, external institutions that use different scan protocols for the same contrast weighting may experience sub-optimal performance.
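To illustrate why two protocols with the same nominal weighting can still produce different image contrast, the following minimal sketch evaluates an idealized spin-echo signal equation, S = PD · (1 − exp(−TR/T1)) · exp(−TE/T2), for two tissues at two TR/TE settings. This textbook model is an assumption made for illustration and is not taken from the study or from SyMRI; the tissue parameters and protocol values are hypothetical.

```python
import math


def spin_echo_signal(pd, t1_ms, t2_ms, tr_ms, te_ms):
    """Idealized spin-echo signal: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - math.exp(-tr_ms / t1_ms)) * math.exp(-te_ms / t2_ms)


# Hypothetical tissue parameters (illustrative only, not measured values).
TISSUES = {
    "tissue_a": {"pd": 0.80, "t1_ms": 900.0, "t2_ms": 60.0},
    "tissue_b": {"pd": 0.85, "t1_ms": 1200.0, "t2_ms": 90.0},
}


def contrast(tr_ms, te_ms):
    """Signal difference between the two hypothetical tissues (b minus a)."""
    a = spin_echo_signal(tr_ms=tr_ms, te_ms=te_ms, **TISSUES["tissue_a"])
    b = spin_echo_signal(tr_ms=tr_ms, te_ms=te_ms, **TISSUES["tissue_b"])
    return b - a


# Two hypothetical protocols that would both be described as T2-weighted
# (long TR, long TE) but that differ in their exact settings.
protocol_1 = contrast(tr_ms=4000.0, te_ms=80.0)
protocol_2 = contrast(tr_ms=6000.0, te_ms=120.0)

print(f"tissue contrast at TR=4000 ms, TE=80 ms:  {protocol_1:.3f}")
print(f"tissue contrast at TR=6000 ms, TE=120 ms: {protocol_2:.3f}")
```

Under this simplified model the two settings give different tissue contrast, which is the kind of within-weighting variation a model trained on a single protocol may not have seen.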

Purpose

The purpose of this study was to develop a method to evaluate the robustness of automatic contouring algorithms to varying MRI contrast weightings.

Methods

One healthy volunteer and one patient were scanned using SyntheticMR on the MR-Simulation device. The parotid and submandibular glands in these subjects were contoured using an automatic contouring algorithm trained on T2-weighted MRIs. For ground truth manual contours, two radiation oncology residents and one pre-resident physician were recruited and their STAPLE consensus was determined. A total of 216 different MRI TR and TE combinations were simulated across T1-, T2-, and PD-weighted contrast ranges using SyntheticMR’s post-processing software, SyMRI. The automatic contouring algorithm's contours were compared with the ground truth using the Dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (HD95).
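The two comparison metrics named above can be sketched in a few lines of Python. This is a simplified illustration and not the study's implementation: masks are represented as sets of voxel coordinates, distances are computed over all voxels rather than over extracted surfaces, and HD95 is taken as the larger of the two directed 95th percentile distances (other definitions exist). The function names, voxel spacing, and toy masks are hypothetical.

```python
import math


def dice(a, b):
    """Dice similarity coefficient between two sets of voxel coordinates."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def percentile(values, q):
    """q-th percentile using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def directed_distances(src, dst, spacing):
    """Distance in mm from each voxel in src to its nearest voxel in dst."""
    def dist(p, q):
        return math.sqrt(sum(((pi - qi) * s) ** 2
                             for pi, qi, s in zip(p, q, spacing)))
    return [min(dist(p, q) for q in dst) for p in src]


def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """HD95 as the larger of the two directed 95th percentile distances."""
    if not a or not b:
        raise ValueError("HD95 is undefined for an empty mask")
    return max(percentile(directed_distances(a, b, spacing), 95),
               percentile(directed_distances(b, a, spacing), 95))


# Toy masks: a 4x4 square and the same square shifted by one voxel in x.
auto_mask = {(x, y, 0) for x in range(4) for y in range(4)}
manual_mask = {(x, y, 0) for x in range(1, 5) for y in range(4)}

toy_dsc = dice(auto_mask, manual_mask)
toy_hd95 = hd95(auto_mask, manual_mask)
print(f"DSC = {toy_dsc:.2f}, HD95 = {toy_hd95:.2f} mm")
```

The brute-force nearest-neighbor search is quadratic in the number of voxels, so it only suits small examples; a distance transform or spatial index would be the usual choice for full-resolution contours.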

Results

Notable differences in the automatic contouring model’s performance were seen across the contrast-weighted range, even within the T2-weighted range. Further, some models performed as well as or better across subsets of the T1-weighted range. The PD-weighted range saw the worst performance. The range of discrepancy in DSC and HD95 exceeded 0.2 and 3.66 mm, respectively, in some structures. In the T2-weighted contrast region where the model was trained, 100%, 40%, 24%, and 57% of the DSC values for the left parotid, right parotid, left submandibular, and right submandibular gland, respectively, exceeded interobserver variability.
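The two summary statistics reported above, the range of a metric across TR/TE combinations and the percentage of combinations that exceed interobserver variability, can be sketched as follows. All numbers below are made up for illustration and are not study results. The sketch also assumes that "exceeding interobserver variability" means a DSC at or above the interobserver DSC; the abstract does not state the exact criterion.

```python
def metric_range(values):
    """Spread of a metric across all simulated TR/TE combinations."""
    return max(values) - min(values)


def fraction_at_or_above(values, threshold):
    """Fraction of combinations whose metric is at or above the threshold."""
    return sum(v >= threshold for v in values) / len(values)


# Made-up DSC values for one structure, keyed by (TR ms, TE ms).
dsc_by_protocol = {
    (4000, 80): 0.82,
    (4000, 100): 0.78,
    (6000, 80): 0.85,
    (6000, 100): 0.61,
}
interobserver_dsc = 0.80  # hypothetical interobserver agreement level

dsc_values = list(dsc_by_protocol.values())
dsc_range = metric_range(dsc_values)
fraction = fraction_at_or_above(dsc_values, interobserver_dsc)

print(f"DSC range across protocols: {dsc_range:.2f}")
print(f"Share at or above interobserver DSC: {fraction:.0%}")
```

For a distance metric such as HD95, where lower is better, the comparison direction would be reversed.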

Conclusions

This study demonstrates the variable performance of MRI-based automatic contouring algorithms across varying TR and TE combinations. This methodology could be applied in future studies to evaluate model sensitivity, out-of-distribution detection ability, and performance drift.

Version published to 10.1101/2025.01.10.25319895v1 on medRxiv
Jan 12, 2025

