An Artificial Intelligence Model for Detection of Heart Failure with Preserved Ejection Fraction: A Report from HeartShare Study
Abstract
Background
Heart failure with preserved ejection fraction (HFpEF) accounts for over half of all heart failure cases in the United States and remains a diagnostic challenge. Non-invasive, scalable screening tools may enable earlier recognition, timely intervention, and improved care. We aimed to evaluate the performance, reproducibility, and early detection capability of an electrocardiogram-based artificial intelligence (ECG-AI) model designed to identify HFpEF, using HeartShare data and real-world ECGs from Wake Forest Baptist Health (WFBH).
Methods
The original ECG-AI model was developed and validated using >1 million ECGs. In this study, we examined the external validity and reproducibility over time of this ECG-AI measure in 432 participants from HeartShare, an NIH-funded study of participants with clinically validated HFpEF and controls. Specifically, we assessed model accuracy (AUC, sensitivity, specificity, predictive values) and reproducibility across three serial ECGs. We also analyzed the potential for early (preclinical) detection of HFpEF in 59,705 real-world ECGs from 12,338 patients in a large integrated healthcare system, Wake Forest Baptist Health (WFBH).
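As an illustration of the accuracy measures named above, the following minimal sketch computes AUC, sensitivity, specificity, and predictive values from binary labels and model scores. It is not the study's analysis code: the function names, the toy data, and the 0.5 decision threshold are assumptions made only for this example.

```python
from typing import Dict, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def confusion_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV at a given score threshold.

    Assumes every denominator is nonzero (true for the toy data below).
    """
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy data (hypothetical): 1 = HFpEF, 0 = control.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(auc(toy_labels, toy_scores))
print(confusion_metrics(toy_labels, toy_scores, threshold=0.5))
```

Note that predictive values depend on disease prevalence in the evaluated sample, so values from a case-control cohort do not transfer directly to a screening population.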
Results
In HeartShare, ECG-AI achieved an AUC of 0.760 (95% CI: 0.729-0.816), with 65% sensitivity and 75% specificity for detection of HFpEF. Using only the lead I ECG as input yielded an AUC that was not significantly different, 0.773 (0.729-0.816). Within-patient reproducibility across three consecutive ECGs showed strong correlations (Pearson r = 0.87-0.89) and strong agreement (Cohen’s κ = 0.68-0.74). Misclassified cases had fewer risk factors and more normal-appearing ECG features. In real-world WFBH data, ECG-AI detected HFpEF up to 4 years before clinical diagnosis, with AUCs from 0.77 to 0.80.
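The two reproducibility statistics reported above can be sketched as follows for one pair of serial ECGs. This is an illustrative example only: it assumes Pearson r is computed on the continuous model outputs and Cohen's κ on thresholded binary calls, and the toy scores and the 0.5 threshold are hypothetical rather than taken from the study.

```python
from math import sqrt
from typing import Hashable, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    a, b = list(a), list(b)
    n = len(a)
    observed = sum(1 for u, v in zip(a, b) if u == v) / n
    categories = set(a) | set(b)
    # Chance agreement from each rating's marginal category frequencies.
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (observed - expected) / (1 - expected)


# Toy model outputs (hypothetical) for the same six patients on two ECGs.
first_ecg = [0.82, 0.35, 0.67, 0.12, 0.55, 0.91]
second_ecg = [0.78, 0.41, 0.52, 0.18, 0.47, 0.88]

# Hypothetical threshold used to turn scores into positive/negative calls.
THRESHOLD = 0.5
first_calls = [int(s >= THRESHOLD) for s in first_ecg]
second_calls = [int(s >= THRESHOLD) for s in second_ecg]

print(pearson_r(first_ecg, second_ecg))
print(cohen_kappa(first_calls, second_calls))
```

With three serial ECGs, the same functions would be applied to each pair of time points, which is consistent with the results being reported as ranges.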
Conclusions
The 12-lead ECG-AI model demonstrates strong generalizability, reproducibility, and early detection capabilities for HFpEF, supporting its potential as a scalable screening and risk stratification tool. The nearly identical single-lead AUC warrants further investigation for remote monitoring.
What is new?
This study is the first to demonstrate that an ECG-AI model for HFpEF maintains strong temporal reproducibility across serial ECGs, supporting its stability as a robust non-invasive tool. We validate the model in both a rigorously phenotyped cohort and a large real-world health system and show that single-lead ECG input achieves accuracy comparable to the full 12-lead model. In addition, we show that the model can identify HFpEF years before clinical diagnosis, extending prior work by establishing ECG-AI as a reproducible, generalizable, and potentially preclinical detection tool.
What are the clinical implications?
The strong temporal reproducibility of the ECG-AI measure indicates that it can provide reliable longitudinal tracking of HFpEF risk, making it suitable for both clinical monitoring and remote assessment. Early detection, up to four years before diagnosis, creates opportunities for proactive evaluation and earlier intervention. The comparable performance of single-lead ECGs also opens the door to scalable deployment through wearable or home-based devices, broadening access to HFpEF screening and enabling continuous risk surveillance outside traditional clinical environments.