Clinician Evaluation of Artificial Intelligence Summaries of Pediatric CVICU Progress Notes

Abstract

Effective communication in critical care units, such as the cardiovascular intensive care unit (CVICU), is vital for patient safety, but clinical notes written by multiple professionals are often lengthy and complex. This study evaluated the Mistral large language model for summarizing CVICU progress notes using the I-PASS framework for structured communication. A total of 385 progress notes were combined by patient and summarized by the model. Readability was assessed using multiple metrics, including Flesch Reading Ease, Flesch-Kincaid Grade Level, Gunning Fog Index, SMOG Index, Automated Readability Index, and Dale-Chall Score, and cosine similarity was used to measure alignment with the original notes. The AI summaries were harder to read, with a Flesch Reading Ease score of 29.25 compared to 56.89 for the original notes, and required a higher reading level: Grade 15.24 for the summaries versus Grade 8.98 for the original notes. A cosine similarity of 0.6 indicated moderate alignment: the summaries retained key details but lost some context. Mistral effectively condensed the notes, but readability suffered as a result. Future work will aim to improve clarity and preserve key clinical details through human-guided evaluation.
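
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how two of them (Flesch Reading Ease and Flesch-Kincaid Grade Level) and a cosine similarity could be computed. It is not the study's pipeline: the abstract does not state which tooling or text representation was used. The syllable counter is a rough vowel-group heuristic, the cosine similarity here is taken over simple word-count vectors (an assumption; embeddings or TF-IDF are equally plausible), and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentence count, word count, syllable count) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Standard formula: higher scores mean easier text."""
    s, w, syl = text_stats(text)
    return 206.835 - 1.015 * (w / s) - 84.6 * (syl / w)


def flesch_kincaid_grade(text: str) -> float:
    """Standard formula: approximate US school grade level."""
    s, w, syl = text_stats(text)
    return 0.39 * (w / s) + 11.8 * (syl / w) - 15.59


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between word-count vectors of two texts."""
    va = Counter(re.findall(r"[a-z]+", a.lower()))
    vb = Counter(re.findall(r"[a-z]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under these formulas, long sentences and polysyllabic vocabulary both lower the Flesch Reading Ease score and raise the grade level, which is consistent with dense, terminology-heavy summaries scoring as harder to read than the notes they condense.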
