External Validation of an Artificial Intelligence Triaging System for Chest X-Rays: A Retrospective Independent Clinical Study

Abstract

Background: Chest radiography (CXR) is the most frequently performed radiological examination worldwide, but reporting backlogs, driven by a shortage of radiologists, remain a critical challenge in emergency care. Artificial intelligence (AI) triage systems can help alleviate this challenge by differentiating normal from abnormal studies and prioritizing urgent cases for review. This study aimed to externally validate TRIA, a commercial AI-powered CXR triage algorithm (NeuralMed, São Paulo, Brazil).

Methods: TRIA employs a two-stage deep learning approach, comprising an image segmentation module that isolates the thoracic region, followed by a classification model trained to recognize common cardiopulmonary pathologies. The system was trained on 275,399 CXRs from multiple public and private datasets. External validation was performed retrospectively on 1045 CXRs (568 normal and 477 abnormal) from a university teaching hospital not included in training. Ground truth was established using a large language model (LLM) to extract findings from the original radiologist reports. An independent radiologist review of a 300-report subset confirmed the reliability of this method, achieving an accuracy of 0.98 (95% CI 0.978–0.988). Four ensemble decision strategies for abnormality detection were compared. Performance metrics included sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUROC) with 95% CIs.

Results: The general abnormality classifier achieved strong performance (AUROC 0.911). Individual pathology models for cardiomegaly, pneumothorax, and effusion showed excellent results (AUROC of 0.968, 0.955, and 0.935, respectively). The weighted ensemble demonstrated the best balance, with an accuracy of 0.854 (95% CI, 0.831–0.874), a sensitivity of 0.845 (0.810–0.875), a specificity of 0.861 (0.830–0.887), and an AUROC of 0.927 (0.911–0.940). Sensitivity-prioritized strategies achieving sensitivity >0.92 produced lower specificity (<0.69). False negatives were mainly subtle or equivocal cases, although many were still flagged as abnormal by the general classifier.

Conclusions: TRIA achieved robust and balanced accuracy in distinguishing normal from abnormal CXRs. Integrating this system into clinical workflows has the potential to reduce reporting delays, prioritize urgent cases, and improve patient safety. These findings support its clinical utility and warrant prospective multicenter validation.
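The abstract does not specify how the weighted ensemble combines the per-pathology models with the general abnormality classifier. The sketch below illustrates one plausible interpretation: each model emits an abnormality probability, and the triage call is a weighted mean compared against a threshold, with sensitivity and specificity computed as in the reported metrics. All weights, probabilities, and the threshold are illustrative assumptions, not the published TRIA configuration.

```python
def weighted_ensemble(probs: dict[str, float],
                      weights: dict[str, float],
                      threshold: float = 0.5) -> bool:
    """Flag a study as abnormal if the weighted mean probability
    across the component models meets or exceeds the threshold.
    Weights and threshold here are hypothetical, not TRIA's."""
    total_w = sum(weights[name] for name in probs)
    score = sum(probs[name] * weights[name] for name in probs) / total_w
    return score >= threshold


def sensitivity_specificity(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    with 1 = abnormal and 0 = normal."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Toy case with made-up model outputs; up-weighting the general
# classifier reflects one way a "weighted ensemble" might balance
# sensitivity and specificity, as the abstract describes.
weights = {"general": 2.0, "cardiomegaly": 1.0, "pneumothorax": 1.0, "effusion": 1.0}
case = {"general": 0.90, "cardiomegaly": 0.12, "pneumothorax": 0.05, "effusion": 0.64}
print(weighted_ensemble(case, weights))  # → True (flagged abnormal)
```

Raising the threshold (or re-weighting toward the general classifier) trades sensitivity for specificity, which is the trade-off the abstract reports between the balanced weighted ensemble and the sensitivity-prioritized strategies.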
