Automated large language model screening of unstructured radiology reports for timely osteoporosis intervention.
Abstract
Our objective was to evaluate the feasibility of a large language model (LLM)–based approach for classifying vertebral osteoporotic compression fractures (VOCFs) from unstructured lumbar spine radiology reports for integration into osteoporosis clinical workflows. Two open-source, on-premise LLM tools were assessed using a few-shot learning strategy to classify reports into five fracture subcategories and to identify “recent fractures for notification.” Performance was evaluated on two real-world datasets authored by 100 unique radiologists and benchmarked against a radiologist-defined reference standard. Both LLMs demonstrated satisfactory performance, achieving sensitivities above 0.70 and specificities above 0.80 across datasets, with one model showing higher sensitivity consistent with a screening role. Inter-run agreement was high (κ > 0.81), and performance was consistent across datasets from different timepoints (Model 1: χ² = 2.46, p = 0.12; Model 2: χ² = 1.50, p = 0.22). Common errors were related to temporal context and complex syntax. These findings demonstrate the feasibility of using open-source LLMs, without fine-tuning, to identify and flag recent osteoporotic fractures from unstructured radiology reports, supporting early referral for expedited osteoporosis screening and management.
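The few-shot classification and screening-metric evaluation described above can be sketched as follows. This is a hypothetical illustration only: the label names, example reports, and prompt wording are invented for demonstration and are not the paper's actual five subcategories, prompts, or LLM tooling.

```python
# Illustrative sketch of a few-shot prompt for classifying lumbar spine
# reports, plus the screening metrics (sensitivity/specificity) used to
# benchmark predictions against a radiologist reference standard.
# All labels and example texts below are placeholders, not the paper's.

FRACTURE_LABELS = [
    "no fracture",
    "old fracture",
    "recent fracture",
    "indeterminate-age fracture",
    "non-osteoporotic fracture",
]  # placeholder five-way label set

FEW_SHOT_EXAMPLES = [
    ("Chronic-appearing wedge deformity of L1, unchanged from prior study.",
     "old fracture"),
    ("Acute compression fracture of L2 with associated marrow edema.",
     "recent fracture"),
]

def build_prompt(report_text: str) -> str:
    """Assemble a few-shot classification prompt for a local LLM."""
    parts = ["Classify the lumbar spine report into one of: "
             + ", ".join(FRACTURE_LABELS) + "."]
    for example, label in FEW_SHOT_EXAMPLES:
        parts.append(f"Report: {example}\nLabel: {label}")
    parts.append(f"Report: {report_text}\nLabel:")
    return "\n\n".join(parts)

def sensitivity_specificity(preds, truth, positive="recent fracture"):
    """Screening metrics for the 'recent fracture for notification' flag."""
    tp = sum(p == positive and t == positive for p, t in zip(preds, truth))
    fn = sum(p != positive and t == positive for p, t in zip(preds, truth))
    tn = sum(p != positive and t != positive for p, t in zip(preds, truth))
    fp = sum(p == positive and t != positive for p, t in zip(preds, truth))
    return tp / (tp + fn), tn / (tn + fp)
```

In a screening role, the prompt would be sent to the on-premise model and the returned label compared against the reference standard; the sensitivity threshold (here, above 0.70) governs how many recent fractures are caught for notification.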