Decoding Clinician Authorial Style: A Style-Informed Pipeline for Clinical Document Summary Generation with Large Language Models
Abstract
Large language models (LLMs) can automate clinical document summary generation, yet even clinically accurate outputs often fail to reflect individual clinicians’ writing styles, leading to substantial post-editing. We examine this stylistic gap using a multi-author corpus of de-identified clinical summaries. We propose a style-informed generation framework that extracts clinician-specific stylistic features through LLM feedback and applies a Train→Generate paradigm to produce personalized clinical summaries. Conventional metrics (ROUGE, BERTScore, cosine similarity) largely failed to distinguish intra-author from inter-author writing patterns, while Jaro-Winkler and BLEU showed only limited sensitivity. Targeted LLM-guided feature extraction, emphasizing rhythm, narration, and sentence or list structure, improved authorship classification accuracy to up to 73%. In blinded clinician A/B testing, GPT-4-generated drafts were preferred less often than original notes, whereas the Gemini 2.5 Pro pipeline produced drafts preferred at rates comparable to, and in some cases exceeding, clinician-authored summaries. Hallucination risks were mitigated through high-fidelity prompt engineering and explicit adherence to source-only data constraints. These results suggest that style-informed generation can narrow the style gap and produce clinically acceptable summaries that better align with each clinician’s voice.
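A toy sketch can illustrate why surface-overlap metrics such as bag-of-words cosine similarity track shared clinical content rather than authorial style (the notes and scores below are hypothetical examples, not drawn from the study’s corpus): two notes by different clinicians describing the same case still score highly, so the metric cannot separate authors.

```python
from collections import Counter
import math


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words token counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical notes on the same case, written by two different clinicians
# with different styles: lexical overlap keeps the score high regardless.
note_clinician_a = "patient stable continue current meds follow up in two weeks"
note_clinician_b = "patient stable continue current meds review again in two weeks"

sim = cosine_similarity(note_clinician_a, note_clinician_b)
print(f"{sim:.2f}")  # high similarity despite different authors
```

Because content dominates such scores, the paper turns to LLM-guided extraction of style features (rhythm, narration, sentence or list structure) to recover author identity.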