Multi-task artificial intelligence annotation of echocardiographic images: a retrospective multi-cohort study

Yuki Sahashi
David Choi
Hirotaka Ieki
Milos Vukadinovic
Meenal Rawlani
Bryan He
Alan C. Kwan
Susan Cheng
David Ouyang

Abstract

Background

A comprehensive transthoracic echocardiogram involves the assessment of over 70 parameters. Manual annotation places a substantial burden on sonographers and physicians and is subject to considerable inter-observer variability. Prior open-source segmentation models have largely addressed 2D B-mode ventricular function, leaving a gap in the spectral Doppler and atrial measurements required for valvular and diastolic assessment, such as velocity-time integral (VTI) and atrial chamber size.

Methods

In this retrospective multi-cohort study, we developed EchoNet-Segmentation, a set of task-specific deep learning segmentation models for left and right atrial area and VTI Doppler measurements. Training used 186,712 sonographer-annotated images from 93,978 studies (56,855 patients) at Cedars-Sinai Medical Center (CSMC). Performance was evaluated on a held-out CSMC test set, a CSMC temporal split, an external Kaiser Permanente Northern California (KPNC) cohort, and the public MIMIC-Echo dataset.
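
As an illustration of how a segmentation output can become a clinical measurement, the following minimal sketch converts a binary chamber mask into an area in cm² and a traced Doppler envelope into a VTI in cm. This page does not describe the authors' post-processing, so everything here is an assumption: the function names `mask_area_cm2` and `velocity_time_integral` are hypothetical, the pixel spacing is assumed to be known from image calibration, and the envelope is assumed to be available as sampled time and velocity pairs integrated with the trapezoidal rule.

```python
from typing import Sequence


def mask_area_cm2(mask: Sequence[Sequence[int]],
                  spacing_x_cm: float,
                  spacing_y_cm: float) -> float:
    """Area of a binary mask: foreground pixel count times the area of one pixel.

    Assumes `mask` holds 0/1 values and that the physical pixel spacing
    (in cm) is known from the image calibration.
    """
    foreground_pixels = sum(1 for row in mask for value in row if value)
    return foreground_pixels * spacing_x_cm * spacing_y_cm


def velocity_time_integral(times_s: Sequence[float],
                           velocities_cm_s: Sequence[float]) -> float:
    """Trapezoidal integral of |velocity| over time, giving a VTI in cm.

    Assumes the envelope is sampled at strictly increasing times (seconds)
    with velocities in cm/s.
    """
    if len(times_s) != len(velocities_cm_s):
        raise ValueError("times and velocities must have the same length")
    if len(times_s) < 2:
        raise ValueError("at least two samples are needed to integrate")

    vti_cm = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Average of the two neighbouring speeds times the time step.
        vti_cm += 0.5 * (abs(velocities_cm_s[i]) + abs(velocities_cm_s[i - 1])) * dt
    return vti_cm


if __name__ == "__main__":
    # Toy envelope: ramps up to 100 cm/s, plateaus, then returns to zero.
    print(velocity_time_integral([0.0, 0.1, 0.2, 0.3], [0.0, 100.0, 100.0, 0.0]))
    # Toy mask with three foreground pixels at 0.5 cm spacing.
    print(mask_area_cm2([[0, 1, 1], [0, 1, 0]], 0.5, 0.5))
```

The absolute value is taken so that flow away from the transducer, which appears below the baseline, still yields a positive VTI; a real pipeline would also need to read the velocity and time calibration from the image.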

Findings

On the CSMC held-out test set, our AI models showed strong agreement with sonographer measurements, with R² of 0.817–0.882 and mean absolute error (MAE) of 1.13–3.80 cm for automated VTI measurements, and R² of 0.675–0.747 and MAE of 2.48–2.52 cm² for left and right atrial area segmentation. Performance was confirmed on the CSMC temporal split (VTI: R² 0.606–0.866, atrial area: R² 0.694–0.705), on the KPNC external cohort (VTI: R² 0.575–0.859, atrial area: R² 0.803–0.876), and on the MIMIC-Echo dataset. Robustness was demonstrated on a different vendor’s machines and across subgroups. EchoNet-Segmentation outperformed an open-source medical image foundation model with bounding-box and point-prompt configurations on R², MAE, and Dice score on both the held-out test dataset and the MIMIC apical four-chamber data.
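
For readers less familiar with the three agreement metrics reported above, the following minimal sketch shows how they are commonly computed. It is not the authors' evaluation code. In particular, this page does not say whether R² is the coefficient of determination or the squared Pearson correlation; the sketch assumes the coefficient of determination, and the function names are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(reference: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute difference between reference and predicted values."""
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def r_squared(reference: Sequence[float],
              predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total.

    Assumes the reference values are not all identical (SS_total > 0).
    """
    mean_ref = sum(reference) / len(reference)
    ss_residual = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_total = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_residual / ss_total


def dice_score(mask_a: Sequence[Sequence[int]],
               mask_b: Sequence[Sequence[int]]) -> float:
    """Dice overlap of two binary masks: 2 * |A and B| / (|A| + |B|)."""
    intersection = 0
    size_a = 0
    size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            size_a += 1 if a else 0
            size_b += 1 if b else 0
            intersection += 1 if (a and b) else 0
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / (size_a + size_b)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    sonographer = [10.0, 20.0, 30.0]
    model = [12.0, 18.0, 33.0]
    print(mean_absolute_error(sonographer, model))
    print(r_squared(sonographer, model))
    print(dice_score([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
```

MAE is in the units of the measurement (cm for VTI, cm² for area), while R² and Dice are unitless, which is why the abstract reports units only for MAE.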

Interpretation

EchoNet-Segmentation is the first open-source framework that delivers accurate, generalizable automated measurement across several key routine echocardiographic parameters, supporting end-to-end automation of clinically important echocardiographic assessments. Public release of model weights, code, and demonstration tools can facilitate reproducibility, research use and clinical deployment.

Funding

Funding Statement: This work was supported by NIH NHLBI grants R00HL157421, R01HL173526, and R01HL173487 to D.O.

Research in context

Evidence before this study

We searched PubMed and arXiv from database inception to April 1, 2026, for studies of deep learning-based segmentation of echocardiographic images, using the terms (“echocardiography” OR “echocardiogram”) AND (“deep learning” OR “artificial intelligence”) AND (“segmentation” OR “measurement”). Prior work has demonstrated automated segmentation of cardiac chambers and left ventricular ejection fraction estimation, and a small number of studies have reported deep learning models for velocity-time integral (VTI) or atrial size measurement. However, openly available models and code remain largely restricted to left ventricular structures, ejection fraction, and wall thickness, and commercial tools remain proprietary. To our knowledge, no open-source framework has comprehensively addressed VTI measurements across multiple Doppler views together with atrial chamber size in a single, reproducible toolkit, and existing models have not been systematically benchmarked against general-purpose medical-image foundation models on echocardiographic tasks.

Added value of this study

We developed and validated EchoNet-Segmentation, a suite of task-specific deep learning models for several clinically important echocardiographic parameters: left and right atrial area and five VTI measurements (aortic valve, mitral valve, left ventricular outflow tract, right ventricular outflow tract, and pulmonary valve). The models were trained on the largest real-world collection of sonographer-annotated echocardiograms reported to date (186,712 images from 56,855 patients) at an academic center in the United States. They showed strong agreement with sonographer measurements on a held-out internal test set, a temporal split cohort, an external cohort from a different health system, and a publicly available cohort recorded on a different vendor’s ultrasound machines. EchoNet-Segmentation outperformed a publicly released medical-image foundation model (MedSAM2) on cardiac chamber segmentation across both internal and public dataset benchmarks. All model weights, training and inference code, demonstration tools, and the manual segmentation masks used for the public benchmark are openly released.

Implications of all the available evidence

When combined with previously released open-source models, EchoNet-Segmentation enables end-to-end automation of routine transthoracic echocardiographic measurements. By openly releasing model weights, training code, and benchmark data, this work provides a reproducible foundation that the broader research and clinical community can build on, fine-tune for specific populations or imaging protocols, and integrate into clinical workflows. Prospective validation and randomized studies will be needed to define the impact of automated measurement on diagnostic accuracy, workflow efficiency, and clinical outcomes.

Version published to 10.64898/2026.06.23.26356383 on medRxiv
Jun 25, 2026
