Multi-task artificial intelligence annotation of echocardiographic images: a retrospective multi-cohort study
Abstract
Background
A comprehensive transthoracic echocardiogram involves the assessment of over 70 parameters. Manual annotation places a substantial burden on sonographers and physicians and is subject to considerable inter-observer variability. Prior open-source segmentation models have largely addressed 2D B-mode ventricular function, leaving a gap in the spectral Doppler and atrial measurements required for valvular and diastolic assessment, such as velocity-time integral (VTI) and atrial chamber size.
Methods
In this retrospective multi-cohort study, we developed EchoNet-Segmentation, a comprehensive set of task-specific deep learning segmentation models for left and right atrial area and VTI Doppler measurements. Training used 186,712 sonographer-annotated images from 93,978 studies (56,855 patients) at Cedars-Sinai Medical Center (CSMC). Performance was evaluated on a held-out CSMC test set, a CSMC temporal split, an external Kaiser Permanente Northern California (KPNC) cohort, and the public MIMIC-Echo dataset.
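To illustrate how a segmentation output can be turned into the two kinds of measurement named above, the following minimal sketch converts a binary chamber mask into an area in cm² and a traced Doppler velocity envelope into a VTI in cm. It is not the authors' pipeline: the function names, the pixel spacing, and the sampled envelope are hypothetical, and the VTI is approximated with the trapezoidal rule.

```python
import numpy as np


def mask_area_cm2(mask, pixel_spacing_cm):
    """Area of a binary mask in cm^2.

    pixel_spacing_cm is an assumed (row, col) spacing in cm per pixel,
    e.g. as it might be read from image metadata.
    """
    row_cm, col_cm = pixel_spacing_cm
    return float(np.count_nonzero(mask) * row_cm * col_cm)


def vti_cm(velocity_cm_s, time_s):
    """Velocity-time integral in cm, using the trapezoidal rule.

    velocity_cm_s: sampled velocity envelope (cm/s); the sign is ignored.
    time_s: matching sample times (s), strictly increasing.
    """
    v = np.abs(np.asarray(velocity_cm_s, dtype=float))
    t = np.asarray(time_s, dtype=float)
    return float(np.sum((v[1:] + v[:-1]) * np.diff(t)) / 2.0)


# Hypothetical example: a 40 x 50 pixel region at 0.1 cm spacing,
# and a triangular envelope peaking at 100 cm/s over 0.3 s.
mask = np.zeros((100, 100), dtype=bool)
mask[10:50, 20:70] = True
area = mask_area_cm2(mask, (0.1, 0.1))       # about 20 cm^2
vti = vti_cm([0, 100, 0], [0.0, 0.15, 0.3])  # about 15 cm
```

Because both quantities are computed directly from the segmented region or trace, errors in the segmentation translate directly into errors in the reported measurement, which is why agreement with sonographer values is the natural evaluation target.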
Findings
On the CSMC held-out test set, our AI models showed strong agreement with sonographer measurements, with R² of 0.817–0.882 and mean absolute error (MAE) of 1.13–3.80 cm for automated VTI measurements, and R² of 0.675–0.747 and MAE of 2.48–2.52 cm² for left and right atrial area segmentation. Performance was consistent on the CSMC temporal split (VTI: R² 0.606–0.866, atrial area: R² 0.694–0.705), on the KPNC external cohort (VTI: R² 0.575–0.859, atrial area: R² 0.803–0.876), and on the MIMIC-Echo dataset. Robustness was demonstrated on a different vendor’s machines and across subgroups. EchoNet-Segmentation outperformed an open-source medical image foundation model, tested with bounding-box and point prompt configurations, on R², MAE, and Dice score on both the held-out test dataset and the MIMIC apical four-chamber data.
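For readers less familiar with the three agreement metrics reported here, the sketch below shows one common way to compute each of them. It is an assumption-laden illustration rather than the study's evaluation code: the function names are hypothetical, and R² is taken as the coefficient of determination (1 minus residual over total sum of squares), which may differ from the definition the authors used.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def mean_absolute_error(y_true, y_pred):
    """Mean of |reference - prediction|, in the units of the measurement."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def dice_score(mask_a, mask_b):
    """Dice overlap of two binary masks: 2|A and B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree fully
    return float(2.0 * np.logical_and(a, b).sum() / total)


# Hypothetical sonographer vs. model values (e.g. VTI in cm).
reference = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 30.0]
r2 = r_squared(reference, predicted)
mae = mean_absolute_error(reference, predicted)
dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

R² and MAE compare the derived measurements, while Dice compares the segmentation masks themselves, so the three metrics together cover both the clinical value and the underlying pixel-level overlap.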
Interpretation
EchoNet-Segmentation is the first open-source framework that delivers accurate, generalizable automated measurement across several key routine echocardiographic parameters, supporting end-to-end automation of clinically important echocardiographic assessments. Public release of model weights, code, and demonstration tools can facilitate reproducibility, research use and clinical deployment.
Funding
This work was supported by NIH NHLBI grants R00HL157421, R01HL173526, and R01HL173487 to D.O.
Research in context
Evidence before this study
We searched PubMed and arXiv from database inception to April 1, 2026, for studies of deep learning-based segmentation of echocardiographic images, using the terms (“echocardiography” OR “echocardiogram”) AND (“deep learning” OR “artificial intelligence”) AND (“segmentation” OR “measurement”). Prior work has demonstrated automated segmentation of cardiac chambers and left ventricular ejection fraction estimation, and a small number of studies have reported deep learning models for velocity-time integral (VTI) or atrial size measurement. However, openly available models and code remain largely restricted to left ventricular structures, ejection fraction, and wall thickness, and commercial tools remain proprietary. To our knowledge, no open-source framework has comprehensively addressed VTI measurements across multiple Doppler views together with atrial chamber size in a single, reproducible toolkit, and existing models have not been systematically benchmarked against general-purpose medical-image foundation models on echocardiographic tasks.
Added value of this study
We developed and validated EchoNet-Segmentation, a suite of task-specific deep learning models for several clinically important echocardiographic parameters: left and right atrial area and five VTI measurements (aortic valve, mitral valve, left ventricular outflow tract, right ventricular outflow tract, and pulmonary valve). The models were trained on the largest real-world collection of sonographer-annotated echocardiograms reported to date (186,712 images from 56,855 patients), collected at an academic center in the United States. They showed strong agreement with sonographer measurements on a held-out internal test set, a temporal split cohort, an external cohort from a different health system, and a publicly available cohort recorded on a different vendor’s ultrasound machines. EchoNet-Segmentation outperformed a publicly released medical-image foundation model (MedSAM2) on cardiac chamber segmentation on both internal and public benchmarks. All model weights, training and inference code, demonstration tools, and the manual segmentation masks used for the public benchmark are openly released.
Implications of all the available evidence
Used together with previously released open-source models, EchoNet-Segmentation enables end-to-end automation of routine transthoracic echocardiographic measurements. By openly releasing model weights, training code, and benchmark data, this work provides a reproducible foundation that the broader research and clinical community can build on, fine-tune for specific populations or imaging protocols, and integrate into clinical workflows. Prospective validation and randomized studies will be needed to define the impact of automated measurement on diagnostic accuracy, workflow efficiency, and clinical outcomes.