Triage with AI: A Rule-out Framework Quantifying the Risks and Benefits of Screening Mammogram Automation
Abstract
Background
AI has been proposed as a triage or “rule out” device to reduce radiologist workload, but it is presently unclear how an AI triage threshold should be determined. We present a framework for determining an optimal threshold.
Materials and Methods
A total of 114,229 bilateral 2D digital screening mammograms from 2006 to 2023 were retrospectively analyzed. Each mammogram was assigned an AI score using Mirai, an open-source deep-learning model. Four metrics were examined at two thresholds separating ruled-out from retained cases: 1) Caseload Reduction Rate (CRR; percent of caseload removed by rule-out), 2) Gross AI False Omission Rate (G-FOR; probability that a ruled-out patient has breast cancer), 3) AI Net False Omission Rate (N-FOR; probability that a ruled-out patient has a breast cancer that the radiologist would have caught in standard care, i.e., without triage), and 4) AI Adjusted Net False Omission Rate (30%) (AN-FOR[30%]; N-FOR adjusted for the hypothetical scenario in which radiologists detect an extra 30% of breast cancers among AI-retained cases). The two thresholds were severity scores of 0.20 (Youden’s J) and 0.05 (AN-FOR[30%]=0). The former is mathematically optimal in the sense that it maximizes Youden’s J statistic (sensitivity + specificity − 1); the latter is the threshold at which AI triage introduces no net increase in false negatives.
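The four metrics above can be illustrated with a minimal Python sketch on toy data. This is not the authors' code. The function name `triage_metrics`, its parameters, and the toy values are hypothetical. In particular, the abstract does not give the exact formula behind the 30% adjustment; the sketch assumes the extra detections equal 30% of the cancers among retained cases that the radiologist missed in standard care, and that they are subtracted from the net missed cancers among ruled-out cases.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TriageMetrics:
    crr: float     # Caseload Reduction Rate
    g_for: float   # Gross AI False Omission Rate
    n_for: float   # AI Net False Omission Rate
    an_for: float  # AI Adjusted Net False Omission Rate


def triage_metrics(
    scores: Sequence[float],
    has_cancer: Sequence[bool],
    radiologist_detected: Sequence[bool],
    threshold: float,
    extra_detection_rate: float = 0.30,
) -> TriageMetrics:
    """Compute rule-out metrics; exams scoring below `threshold` are ruled out."""
    ruled_out = [s < threshold for s in scores]
    n_ruled_out = sum(ruled_out)
    if n_ruled_out == 0:
        raise ValueError("no exams fall below the threshold")

    rows = list(zip(ruled_out, has_cancer, radiologist_detected))
    # Every cancer among ruled-out exams counts toward the gross rate.
    gross_missed = sum(1 for r, c, _ in rows if r and c)
    # Only ruled-out cancers the radiologist would have caught count toward the net rate.
    net_missed = sum(1 for r, c, d in rows if r and c and d)
    # ASSUMPTION: the adjustment credits a fraction of the cancers that the
    # radiologist missed among retained exams. The source does not spell this out.
    retained_missed = sum(1 for r, c, d in rows if not r and c and not d)
    adjusted_missed = net_missed - extra_detection_rate * retained_missed

    return TriageMetrics(
        crr=n_ruled_out / len(scores),
        g_for=gross_missed / n_ruled_out,
        n_for=net_missed / n_ruled_out,
        an_for=adjusted_missed / n_ruled_out,
    )


# Toy data (hypothetical): ten exams with AI scores, ground truth, and reader results.
scores = [0.01, 0.02, 0.03, 0.04, 0.10, 0.30, 0.40, 0.50, 0.60, 0.70]
has_cancer = [False, True, False, True, False, False, True, True, False, True]
radiologist_detected = [False, True, False, False, False, False, True, False, False, False]

print(triage_metrics(scores, has_cancer, radiologist_detected, threshold=0.05))
```

Under this assumed formula the adjusted rate is left unclipped, so a negative value would indicate a net reduction in missed cancers, and the value zero marks the break-even threshold described above.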
Results
At the 0.20 threshold, G-FOR, N-FOR, and AN-FOR(30%) were 0.26%, 0.17%, and 0.14%, respectively (223, 141, and 121 missed cancer cases), with CRR=75%. At the 0.05 threshold, G-FOR, N-FOR, and AN-FOR(30%) were 0.12%, 0.07%, and 0.00%, respectively (49, 30, and 0 missed cancer cases), with CRR=36%.
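As a rough arithmetic check, each rate can be approximated from the reported missed-case count, the CRR, and the total of 114,229 exams. The sketch below assumes that the number of ruled-out exams is CRR times the total, which is only approximate because CRR is reported as a rounded percentage. The helper name `implied_rate_percent` is hypothetical.

```python
N_EXAMS = 114_229  # total screening mammograms reported in the abstract


def implied_rate_percent(missed_cases: int, crr: float, n_exams: int = N_EXAMS) -> float:
    """Missed cancers per ruled-out exam, expressed as a percentage."""
    # Approximation: the ruled-out count is reconstructed from the rounded CRR.
    n_ruled_out = crr * n_exams
    return 100.0 * missed_cases / n_ruled_out


# Missed-case counts and CRR values as reported for each threshold.
reported = {
    0.20: {"crr": 0.75, "missed": {"G-FOR": 223, "N-FOR": 141, "AN-FOR(30%)": 121}},
    0.05: {"crr": 0.36, "missed": {"G-FOR": 49, "N-FOR": 30, "AN-FOR(30%)": 0}},
}

for threshold, row in reported.items():
    for metric, missed in row["missed"].items():
        rate = implied_rate_percent(missed, row["crr"])
        print(f"threshold={threshold:.2f} {metric}: {rate:.2f}%")
```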
Conclusion
We demonstrate how radiology practices can weigh the trade-offs of different AI score triage thresholds. Under the AN-FOR(30%) assumption, the Youden’s J threshold (0.20) results in 121 additional missed cancers for a 75% caseload reduction, whereas the 0.05 threshold yields an estimated zero additional missed cancers at a 36% caseload reduction.