EchoGraph: A Specialized Solution for Automatic Echocardiography Report Quality Evaluation

Abstract

With the rise of generative AI, there is a growing need for automatic metrics that evaluate the factual accuracy of clinical text, yet no dedicated tool exists for echocardiography, and existing evaluation tools often underperform in this domain. To address this, we developed EchoGraph, a BERT-based model trained on densely annotated echocardiography reports using a schema tailored to this subspecialty and a dedicated F1-style reward that emphasizes clinically important components. EchoGraph demonstrated strong performance in predicting entities (micro F1 0.85) and relations (micro F1 0.70). Its F1 reward was more sensitive to corrupted report content than the RadGraph F1 reward (a 50%-60% drop vs. a 6%-12% drop). EchoGraph thus offers an effective solution for evaluating and advancing language model-based applications in echocardiography, supporting the development of more accurate and clinically meaningful AI-generated reports.
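
For readers unfamiliar with the micro F1 figures quoted above, the following is a minimal sketch of how a micro-averaged F1 score over extracted entities can be computed. It is an illustration only, not the EchoGraph implementation: the function name `micro_f1`, the representation of an entity as a (text, label) tuple, and the example labels are all assumptions, and the abstract does not specify how the dedicated F1-style reward weights clinically important components.

```python
from typing import Hashable, Iterable, Set, Tuple


def micro_f1(pairs: Iterable[Tuple[Set[Hashable], Set[Hashable]]]) -> float:
    """Micro-averaged F1 over (predicted, reference) set pairs.

    Counts are pooled across all reports before precision and recall
    are computed, which is what "micro" averaging means.
    """
    tp = fp = fn = 0
    for predicted, reference in pairs:
        tp += len(predicted & reference)   # items found in both sets
        fp += len(predicted - reference)   # predicted but not in reference
        fn += len(reference - predicted)   # in reference but missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical entities; the labels are illustrative, not the paper's schema.
reference = {
    ("left ventricle", "anatomy"),
    ("ejection fraction", "measurement"),
    ("mitral regurgitation", "observation"),
}
predicted = {
    ("left ventricle", "anatomy"),
    ("ejection fraction", "measurement"),
    ("aortic stenosis", "observation"),  # spurious prediction
}

score = micro_f1([(predicted, reference)])
```

In this toy case two of three predicted entities match and two of three reference entities are recovered, so precision and recall are both 2/3 and so is the F1 score. The same pooling applies to relations if each relation is represented as a hashable tuple.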
