Privacy-Preserving Large Language Model Deployment for Oncology Registry Abstraction: Structure-Aware Evaluation in a Real-World Clinical Setting

Ruslan Enikeev
Max Moldovan
Megan Chu
Anisha Amalraj
Prajakta Prashant Koli
Shabbir Syed Abdul
Huren Sivaraj
Usman Iqbal
Chee Keong Toh


Abstract

Background

Structuring oncology clinical notes into registry-grade variables is essential for research and care but remains labour-intensive and error-prone.

Objective

To develop and evaluate a privacy-preserving large language model pipeline for oncology registry abstraction in a real-world clinical setting.

Methods

We deployed an open-source Meta Llama 3.3 70B–based pipeline to extract over 50 variables from 6,700 oncology notes at a cancer centre in Singapore. Data were de-identified locally using a Hide-In-Plain-Sight approach, ensuring no identifiable data left hospital infrastructure. Performance was assessed on 200 randomly sampled notes with adjudicated ground truth. A structure-aware framework classified outputs as correct, missing, spurious, or incorrect.
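The Hide-In-Plain-Sight idea mentioned above is commonly described as replacing detected identifiers with realistic surrogates, so that any identifier the detector misses is hard to tell apart from the substitutes. The following is a minimal sketch of that replacement step only, not the authors' implementation. The surrogate pools, the `hide_in_plain_sight` function, and the assumption that an upstream detector has already supplied character spans are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

# (start, end, label) character span, assumed to come from an upstream detector.
Span = Tuple[int, int, str]

# Hypothetical surrogate pools; a real system would use far larger lists.
SURROGATES: Dict[str, List[str]] = {
    "NAME": ["Alex Tan", "Jamie Lim", "Sam Lee"],
    "DATE": ["03 Feb 2021", "17 Aug 2020", "29 Nov 2019"],
}


def hide_in_plain_sight(text: str, spans: List[Span], seed: int = 0) -> str:
    """Replace each detected span with a surrogate of the same type."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        pieces.append(text[cursor:start])                # untouched text
        pieces.append(rng.choice(SURROGATES[label]))     # realistic substitute
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


note = "Mr John Doe was seen on 12 Jan 2022 for follow-up."
name_start = note.index("John Doe")
date_start = note.index("12 Jan 2022")
spans = [
    (name_start, name_start + len("John Doe"), "NAME"),
    (date_start, date_start + len("12 Jan 2022"), "DATE"),
]
deidentified = hide_in_plain_sight(note, spans)
print(deidentified)
```

Because the surrogates look like real values, the output stays readable for downstream extraction while the original identifiers are no longer present.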

Results

F1 scores were high across variables, including diagnosis (97.2%), histology (95.8%), stage (92.6%), biomarkers (91.4%), and treatments (88.1%). Transferability testing on 50 external notes showed strong performance for core variables.
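To make the link between the four outcome categories and an F1 score concrete, here is a minimal sketch under one common convention. The abstract does not state how the categories were turned into F1, so the mapping is an assumption: correct counts as a true positive, spurious as a false positive, missing as a false negative, and incorrect as both a false positive and a false negative. The function names, the `normalise` helper, and the example values are hypothetical.

```python
from collections import Counter
from typing import Optional


def normalise(value: str) -> str:
    """Hypothetical normalisation: ignore case and surrounding whitespace."""
    return value.strip().lower()


def classify(pred: Optional[str], gold: Optional[str]) -> str:
    """Assign one extracted value to an outcome category."""
    if pred is None and gold is None:
        return "absent"      # nothing to extract, nothing extracted; not scored here
    if pred is None:
        return "missing"     # ground truth has a value, the model gave none
    if gold is None:
        return "spurious"    # the model gave a value, ground truth has none
    return "correct" if normalise(pred) == normalise(gold) else "incorrect"


def f1_from_counts(counts: Counter) -> float:
    """F1 under the assumed mapping of categories to TP/FP/FN."""
    tp = counts["correct"]
    fp = counts["spurious"] + counts["incorrect"]
    fn = counts["missing"] + counts["incorrect"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative (prediction, ground truth) pairs for a single variable.
pairs = [("II", "II"), ("III", "II"), (None, "IV"),
         ("I", None), ("iv ", "IV"), (None, None)]
counts = Counter(classify(p, g) for p, g in pairs)
f1 = f1_from_counts(counts)
print(dict(counts), round(f1, 3))
```

Keeping missing, spurious, and incorrect as separate counts, rather than collapsing them into a single error total, shows whether a low score comes from omissions or from wrong values.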

Conclusions

Privacy-preserving LLMs can achieve near–human-level accuracy for oncology abstraction, with structure-aware evaluation enabling more clinically meaningful assessment.

Version published to 10.64898/2026.05.18.26353541 on medRxiv
May 21, 2026

Related articles

Large language models for cancer registry abstraction: a real-world evaluation across models, variables, and cancer types

This article has 16 authors:
1. Joshua T Fuchs
2. Matthew J Satusky
3. Peter J Leese
4. Subhadeep Nag
5. Isaiah W Zipple
6. Chris D Baggett
7. Sydney Lash
8. Katherine Reeder-Hayes
9. William A Wood
10. Cara T Johnson
11. Claire Critchley
12. Ashok K Krishnamurthy
13. Jennifer Elston Lafata
14. Caroline A Thompson
15. Melissa A Troester
16. Emily R Pfaff
This article has no evaluations. Latest version: Jun 29, 2026

General-purpose large language models can achieve physician-level accuracy in complex medical data extraction

This article has 2 authors:
1. Manu Rajeev
2. Ananthu Narayan
This article has no evaluations. Latest version: Jun 10, 2026

Automated Disease Activity Assessment in Systemic Lupus Erythematosus Using Privacy-Preserving Large Language Models

This article has 14 authors:
1. Danting Zhang
2. Renee L. Leung
3. Chun-Ka Wong
4. Shirley Chiu Wai Chan
5. Yi Li
6. Eric H. M. Tang
7. Tingting Wu
8. Tak Mao Chan
9. Chak-Sing Lau
10. Carlos King Ho Wong
11. Kathy Sze Man Leung
12. Zoie Shui-Yee Wong
13. Joseph Tsz-Kei Wu
14. Desmond Yat-Hin Yap
This article has no evaluations. Latest version: Jul 10, 2026
