An Open-Source Retrospective Analysis of Hypertrophic and Dilated Cardiomyopathy Using Machine Learning and Electrocardiogram Data

Abstract

Background/Objectives: Dilated (DCM) and hypertrophic cardiomyopathy (HCM) are common cardiomyopathies associated with heart failure. Electrocardiogram (ECG) screening before an echocardiogram could help streamline diagnosis, particularly in rural areas. Prior ECG machine-learning (ML) studies of cardiomyopathy have not used open-source data, and very few proprietary studies directly compare HCM and DCM or address ECG differences between obstructive (HOCM) and non-obstructive HCM (HNCM). Methods: Standard and vectorcardiogram-derived (VCG) ECG features were extracted from the MIMIC-IV-ECG database. The final cohort comprised 599 patients (HCM = 208 [HOCM = 99, HNCM = 53, unknown = 56], DCM = 391 [ischemic cardiomyopathy with left ventricular dilation = 250, non-ischemic = 141]). Logistic regression (LR) and extreme gradient boosting (XGBoost) models, evaluated with five-fold cross-validation, were used to separate HCM from ischemic cardiomyopathy with left ventricular dilation (DCM-I) and from non-ischemic DCM (DCM-NI), and to separate HOCM from HNCM. Results: Using the area under the receiver operating characteristic curve (AUC-ROC) as the performance metric, LR achieved high discrimination of HCM from DCM-I (0.92) and from DCM-NI (0.90). However, differentiating HOCM from HNCM proved more difficult (XGBoost = 0.81; LR = 0.75). Both DCM subtypes (especially ischemic) showed lower QRS amplitudes and a right-posterior ventricular gradient orientation; HCM displayed higher amplitudes and larger, more complex T-loops. Within HCM, HOCM had stronger leftward electrical activity and a higher ratio of dipolar to non-dipolar QRS energy after singular value decomposition. Conclusions: Using only open-access data, we demonstrate an interpretable ECG-based pipeline that discriminates between cardiomyopathy types and highlights their distinct ECG features. While detecting obstruction remains difficult, ECG features provide measurable separation, supporting possible diagnostic screening and offering a reproducible framework for future studies.
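The AUC-ROC values quoted in the abstract can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below illustrates that reading in plain Python. It is not the authors' code: the function name `auc_roc`, the label convention (1 for one class, 0 for the other), and the toy scores are all assumptions made for illustration and are unrelated to the study data.

```python
def auc_roc(labels, scores):
    """Return AUC-ROC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = positive class, 0 = negative class.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

# Three of the four positive/negative pairs are ordered correctly.
print(auc_roc(toy_labels, toy_scores))  # 0.75
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes. In a cross-validated setting such as the one described above, the metric would be computed on held-out predictions rather than on the data used for fitting.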
