Generation and Evaluation of Realistic Synthetic Clinical Progress Notes for Prostate Cancer using Large Language Models

Álvaro Rey-Blanes
Francisco J. Moreno-Barea
Javier Veredas-Morente
Eloy Vivas-Vargas
Fátima Gil-García
Francisco J. Veredas

Abstract

Background and Objective

Access to real-world electronic health records (EHRs) remains limited by privacy, governance and annotation constraints, hindering the development of clinical natural language processing models. Realistic synthetic progress notes may provide EHR-like corpora that preserve clinically rigorous information on diagnoses, treatments, symptoms, imaging, laboratory findings and therapeutic trajectories without relying directly on sensitive patient records. This study evaluates whether large language models (LLMs) can generate realistic Spanish prostate cancer progress notes from published case reports, preserving clinical content, temporality and hospital-style conventions.

Methods

We compiled 109 Spanish prostate cancer case reports from the biomedical literature and characterised their clinical content using Spanish biomedical named-entity recognition (NER) models, complemented by rule-based extraction of prostate-specific antigen (PSA) values and Gleason scores. GPT-5.4 Nano, Qwen 3.5:35B A3B and GLM-5 were used to generate EHR-style progress notes from these case reports under plain-text and entity-enriched prompting strategies, in both zero-shot and few-shot settings. Evaluation combined lexical and semantic similarity metrics with structured LLM-as-a-judge assessment using Claude Sonnet 4.6, binary safety screening and expert clinical review.
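
The abstract does not describe the rules used for PSA and Gleason extraction, so the following is only a minimal sketch of what such rule-based extraction might look like. The regular expressions, the function names (`extract_psa`, `extract_gleason`) and the sample note are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical patterns, NOT the authors' actual rules.
# PSA value followed by ng/ml, e.g. "PSA de 12,5 ng/ml" or "PSA: 0.8 ng/mL".
PSA_RE = re.compile(
    r"\bPSA\b[^\d\n]{0,20}?(\d+(?:[.,]\d+)?)\s*ng\s*/\s*ml",
    re.IGNORECASE,
)
# Gleason pattern pair, e.g. "Gleason 7 (4+3)" or "Gleason 4 + 3".
GLEASON_RE = re.compile(
    r"\bGleason\b[^\n]{0,20}?([1-5])\s*\+\s*([1-5])",
    re.IGNORECASE,
)


def extract_psa(text):
    """Return PSA values (ng/ml) as floats, accepting a decimal comma or point."""
    return [float(m.group(1).replace(",", ".")) for m in PSA_RE.finditer(text)]


def extract_gleason(text):
    """Return primary and secondary Gleason patterns plus their sum."""
    results = []
    for m in GLEASON_RE.finditer(text):
        primary, secondary = int(m.group(1)), int(m.group(2))
        results.append(
            {"primary": primary, "secondary": secondary, "score": primary + secondary}
        )
    return results


# Invented example sentence in the style of a Spanish case report.
SAMPLE_NOTE = (
    "Varón de 68 años. PSA de 12,5 ng/ml al diagnóstico. "
    "Biopsia: adenocarcinoma Gleason 7 (4+3). Tras tratamiento, PSA: 0.8 ng/mL."
)

print(extract_psa(SAMPLE_NOTE))      # [12.5, 0.8]
print(extract_gleason(SAMPLE_NOTE))  # [{'primary': 4, 'secondary': 3, 'score': 7}]
```

A production rule set would likely need to handle further variants (other units, ISUP grade groups, values reported without units), which this sketch deliberately leaves out.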

Results

All models preserved substantial clinical content, although lexical-overlap metrics showed variable agreement with semantic and clinical quality assessments, reflecting the abstractive nature of the task. Entity-enriched prompting improved lexical and semantic alignment, but did not consistently improve clinical safety. Qwen 3.5:35B A3B was unstable under entity-enriched few-shot prompting, showing increased safety-critical errors and contradictions. GPT-5.4 Nano achieved strong automatic scores but showed isolated clinical inconsistencies. GLM-5 showed the most robust overall profile and performed close to human-authored notes in expert review.

Conclusions

LLMs can generate clinically plausible Spanish prostate cancer progress notes from published case reports under controlled conditions. These findings support the potential use of EHR-like synthetic corpora for clinical NLP, although reliability remains model- and prompt-dependent. Expert validation and safety-oriented evaluation are therefore necessary before downstream use or clinical deployment.

Version published to 10.64898/2026.05.25.26354027 on medRxiv
May 28, 2026

Entity-Aware Generation of Synthetic Clinical Progress Notes for Prostate Cancer using Large Language Models

This article has 4 authors:
1. Álvaro Rey-Blanes
2. Javier Veredas-Morente
3. Francisco J. Moreno-Barea
4. Francisco J. Veredas
This article has no evaluations. Latest version: Jun 15, 2026

Assessment of Zero-Shot Large Language Model (LLM) Assisted Clinical Trial Matching Processes: A Metastatic Cancer Use Case

This article has 10 authors:
1. Yingjie Weng
2. Himani Yalamaddi
3. Danning Fu
4. Ankita Mishra
5. Bryan J. Bunning
6. Andrew B. Martin
7. Jessica Hope
8. Vivek Charu
9. Allison Kurian
10. Manisha Desai
This article has no evaluations. Latest version: Jul 10, 2026

Evaluating Large Language Models for Translating Multimodal Phenotype Documentations into Executable EHR Phenotyping Algorithms

This article has 12 authors:
1. Chao Yan
2. Yi Xin
3. Wu-Chen Su
4. Srushti Gangireddy
5. Shravani Durbhakula
6. Stephen P. Bruehl
7. Alyson L. Dickson
8. Lang Li
9. QiPing Feng
10. Bradley A. Malin
11. Tyler Derr
12. Wei-Qi Wei
This article has no evaluations. Latest version: May 26, 2026
