Retrospective Evaluation of an AI System to Classify Negative Musculoskeletal Trauma Radiographs
Abstract
Study objectives
To evaluate how reliably an AI system for musculoskeletal (MSK) radiography identifies examinations without injury-related pathologies, with the aim of supporting AI-guided discharge of patients without traumatic findings.
Methods
We retrospectively sampled radiographic examinations of suspected MSK trauma, each comprising one or more radiographs and a radiological report, from >1,000 clinical sites in two countries. Medically trained professionals independently classified all exams. When a classification disagreed with the original radiological report, a third professional adjudicated. Annotators also recorded their confidence level for each classification. The AI system analyzed all exams and assigned each to one of three categories: AI Positive, AI Negative, or AI Negative (very high confidence). Performance of the AI system was assessed using error rate, false negative rate (FNR), and qualitative review of misclassified exams.
Results
A total of 2,962 exams were included. The AI classified 27.6% (818/2,962; 95% CI: 26.0–29.3) of exams as AI Negative (very high confidence). Of all exams, 0.7% (21/2,962; 95% CI: 0.0–1.0) were falsely classified as high-confidence negatives, corresponding to a false negative rate (FNR) of 2.0% (21/1,026; 95% CI: 0.0–2.9). Qualitative review of the false negatives showed that the majority would have had no clinical consequence if correctly diagnosed at routine follow-up the following day, and no clearly high-risk exams were missed.
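The point estimates above follow directly from the reported counts. The sketch below recomputes them, using the stated total of 2,962 exams as the denominator and taking 1,026, the denominator reported for the FNR, to be the number of reference-positive exams. The abstract does not say how the confidence intervals were derived. The Wilson score interval here is an assumption chosen for illustration, and the variable and function names are hypothetical. It is shown only for the share of high-confidence negatives and is not claimed to reproduce the other reported intervals.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is
    used here purely as an illustrative choice (z = 1.959964 for 95%).
    """
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Counts taken from the Results paragraph.
n_exams = 2962                    # all included exams
n_very_high_conf_negative = 818   # classified AI Negative (very high confidence)
n_false_negative = 21             # high-confidence negatives that were actually positive
n_positive = 1026                 # FNR denominator (assumed: reference-positive exams)

share_negative = n_very_high_conf_negative / n_exams
error_rate = n_false_negative / n_exams   # false negatives over all exams
fnr = n_false_negative / n_positive       # false negatives over positive exams

low, high = wilson_interval(n_very_high_conf_negative, n_exams)

print(f"AI Negative (very high confidence): {share_negative:.1%} "
      f"(Wilson 95% CI {low:.1%} to {high:.1%})")
print(f"Error rate: {error_rate:.1%}")
print(f"FNR: {fnr:.1%}")
```

The two rates differ only in their denominator: the error rate is taken over all exams, while the FNR is taken over exams with injury-related findings, which is why the same 21 misclassified exams yield 0.7% and 2.0% respectively.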
Conclusion
The AI system identified over one-quarter of MSK trauma radiographs as confidently negative with a very low rate of false negatives, performing on par with or better than the reported standard of care. These results suggest potential for safe, AI-driven decision support and workflow optimization for the discharge of patients with clearly negative examinations.