Diagnostic Accuracy of a Multi-Target Artificial Intelligence Service for the Simultaneous Assessment of 16 Pathological Features on Chest and Abdominal CT
Abstract
Background: Chest, abdominal, and pelvic computed tomography (CT) with intravenous contrast is widely used for tumor staging, treatment planning, and therapy monitoring. The integration of artificial intelligence (AI) services is expected to improve diagnostic accuracy across multiple anatomical regions simultaneously. Purpose: To evaluate the diagnostic accuracy of a multi-target AI service in detecting 16 pathological features on chest and abdominal CT images. Methods: We conducted a retrospective study using anonymized CT data from an open dataset. A total of 229 CT scans were independently interpreted by four radiologists with more than 5 years of experience and analyzed by the AI service. Sixteen pathological features were assessed. AI errors were classified as minor, intermediate, or clinically significant. Diagnostic accuracy was evaluated using the area under the receiver operating characteristic curve (AUC). Results: Across 229 CT scans, the AI service made 423 errors (11.5% of all evaluated features, n = 3664). False positives accounted for 262 cases (61.9%) and false negatives for 161 (38.1%). Most errors were minor (62.9%) or intermediate (31.7%), while clinically significant errors comprised only 5.4%. The overall AUC of the AI service was 0.88 (95% CI: 0.87–0.89), compared with 0.78–0.81 for radiologists. For clinically significant findings, the AI AUC was 0.90 (95% CI: 0.71–1.00). Diagnostic accuracy was unsatisfactory only for urolithiasis. Conclusions: The multi-target AI service demonstrated high diagnostic accuracy for chest and abdominal CT interpretation, with most errors being clinically negligible; performance was limited for urolithiasis.
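The error percentages in the Results follow directly from the reported counts. The short Python sketch below reproduces that arithmetic as a consistency check. It uses only the figures quoted in the abstract; the variable names and the `pct` helper are illustrative and are not taken from the study's own analysis code.

```python
# Counts as reported in the abstract (no other study data is assumed).
n_scans = 229          # CT scans evaluated
n_features = 16        # pathological features assessed per scan
n_errors = 423         # total AI errors
false_positives = 262
false_negatives = 161

# Each scan is evaluated once per feature.
n_evaluations = n_scans * n_features


def pct(part: int, whole: int) -> float:
    """Hypothetical helper: percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


error_rate = pct(n_errors, n_evaluations)   # share of evaluations that were errors
fp_share = pct(false_positives, n_errors)   # share of errors that were false positives
fn_share = pct(false_negatives, n_errors)   # share of errors that were false negatives

print(f"evaluations: {n_evaluations}")
print(f"error rate: {error_rate}%")
print(f"false positives: {fp_share}% of errors")
print(f"false negatives: {fn_share}% of errors")
```

The computed values agree with the abstract: 229 × 16 gives the 3664 evaluated features, and the two error types sum to the 423 reported errors.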