Privacy-Preserving Retrieval for Auditable Clinical Language Modeling on Real-World Radiology Data

Abstract

Clinical large language models adapted for real-world use are commonly fine-tuned on patient data, which embeds confidential information in model parameters and limits both auditability and privacy. We evaluate a retrieval-based framework that separates clinical data from the language model by storing patient records in externally governed memory. On a radiology report summarisation task, retrieval recovers 32–67% of the gains achieved by fine-tuning (perplexity, ROUGE-L; p < 0.0001), while hallucination reduction remains a focus for future work.
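To make the separation of patient records from the model concrete, here is a minimal sketch of an externally governed memory with an audit trail. It is an illustration only, not the framework evaluated in the preprint: the names `ExternalMemory` and `build_prompt`, the token-overlap scoring, and the prompt layout are all hypothetical, and a real system would likely use a dense retriever and proper access control.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ExternalMemory:
    """Hypothetical governed store: records live here, not in model weights."""
    records: dict[str, str] = field(default_factory=dict)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def _log(self, action: str, record_id: str) -> None:
        # Every access is recorded as (UTC timestamp, action, record id).
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, action, record_id))

    def add(self, record_id: str, text: str) -> None:
        self.records[record_id] = text
        self._log("add", record_id)

    def remove(self, record_id: str) -> None:
        # Deleting a record needs no retraining, unlike data baked into weights.
        self.records.pop(record_id, None)
        self._log("remove", record_id)

    def retrieve(self, query: str, k: int = 1) -> list[tuple[str, str]]:
        # Toy relevance score (an assumption): tokens shared with the query.
        q = set(query.lower().split())
        scored = sorted(
            self.records.items(),
            key=lambda item: len(q & set(item[1].lower().split())),
            reverse=True,
        )
        hits = scored[:k]
        for record_id, _ in hits:
            self._log("retrieve", record_id)
        return hits


def build_prompt(memory: ExternalMemory, findings: str, k: int = 1) -> str:
    """Prepend retrieved reports as context for a frozen (not fine-tuned) model."""
    context = "\n".join(text for _, text in memory.retrieve(findings, k))
    return f"Context:\n{context}\n\nFindings:\n{findings}\n\nImpression:"
```

In this sketch the language model never sees a record unless `retrieve` returns it, so each use of patient data leaves an entry in `audit_log`, and removing a record takes effect on the next query.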
