Fine-Tuning LLaMA2 for Summarizing Discharge Notes: Evaluating the Role of Highlighted Information

Abstract

Purpose: This study investigates whether highlighting information in discharge notes improves the quality of summaries generated by large language models (LLMs). Specifically, it compares highlighted and unhighlighted inputs for fine-tuning the LLaMA2-13B model on the summarization task.

Methods: We fine-tuned two variants of the LLaMA2-13B model on the MIMIC-IV-Ext-BHC dataset: one on highlighted discharge notes (H-LLaMA) and the other on the same notes without highlighting (U-LLaMA). Highlighting was performed automatically using a Cardiology Interface Terminology (CIT) presented in our previous work. H-LLaMA and U-LLaMA were evaluated on a randomly selected test set of 100 discharge notes using multiple metrics, including BERTScore, ROUGE-L, BLEU, and SummaC_CONV. Additionally, ChatGPT-4o served as an LLM-based judge to rate coherence, fluency, conciseness, and correctness, and completeness was evaluated manually on a random sample of 20 notes.

Results: H-LLaMA consistently outperformed U-LLaMA across all metrics. Summaries generated by H-LLaMA (H-summaries) scored higher than those generated by U-LLaMA (U-summaries) on BERTScore (63.75 vs. 59.61), ROUGE-L (23.43 vs. 21.82), BLEU (10.4 vs. 8.41), and SummaC_CONV (67.7 vs. 40.2). Manual review also showed improved completeness for H-summaries (54.2% vs. 48.1%). All improvements were statistically significant (p < 0.05). Moreover, LLM-based evaluation indicated higher average ratings for coherence, correctness, and conciseness.

Conclusion: Fine-tuning LLMs on discharge notes with highlighted information enhances summarization quality. This approach provides a scalable method for improving discharge note summarization and has the potential to support better clinical decision-making through more informative and reliable summaries.
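
For readers unfamiliar with ROUGE-L, one of the automatic metrics reported above, the following is a minimal sketch of how such a score can be computed from the longest common subsequence (LCS) of tokens shared by a reference and a generated summary. It is an illustration only: the abstract does not say which ROUGE implementation or settings the authors used, and the whitespace tokenization, lowercasing, plain F1 weighting, and example sentences here are all assumptions made for the sketch.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """Simplified sentence-level ROUGE-L (precision, recall, F1).

    Assumptions (not taken from the article): tokens are produced by
    lowercasing and splitting on whitespace, no stemming is applied,
    and F1 weights precision and recall equally.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    lcs = lcs_length(ref, cand)
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    f1 = 2 * precision * recall / (precision + recall) if lcs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sentences for illustration, not drawn from MIMIC-IV-Ext-BHC.
    reference = "patient admitted with chest pain and discharged home"
    candidate = "patient discharged home with chest pain"
    print(rouge_l(reference, candidate))
```

Because the score rewards in-order token overlap rather than meaning, it is typically reported alongside semantic and factual-consistency metrics such as BERTScore and SummaC_CONV, as is done in this study.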
