Machine learning to classify left ventricular hypertrophy using ECG feature extraction by variational autoencoder

Abstract

Background

Traditional ECG criteria for left ventricular hypertrophy (LVH) have modest diagnostic yield.

Objective

To develop and validate machine learning (ML) models for LVH diagnosis from the ECG.

Methods

ECG summary features (rate, intervals, axis), R-wave, S-wave and overall-QRS amplitudes, and QRS voltage-time integrals (VTI_QRS) were extracted from 12-lead, vectorcardiographic X-Y-Z-lead, and 3D (L2 norm) representative-beat ECGs. Latent features (30 per ECG) were extracted from X-Y-Z-lead representative-beat ECG signals using a variational autoencoder trained on >1 million unselected ECGs. Logistic regression, random forest, light gradient boosted machine (LGBM), residual network (ResNet) and multilayer perceptron (MLP) models using ECG features and sex, and a convolutional neural network (CNN) using ECG signals alone, were trained to predict LVH (left ventricular mass index >95 g/m² in women, >115 g/m² in men) on 482,734 adult ECG-echocardiogram pairs (ECG and echocardiogram recorded within 45 days of each other). ROC-AUCs for LVH classification are reported from a separate hold-out test set.
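
As a rough illustration of two of the quantities named above, the sketch below computes a 3D (L2 norm) magnitude signal from X-Y-Z leads, a QRS voltage-time integral, and the LVH label from the stated mass-index thresholds. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the "F"/"M" sex coding, the use of the absolute signal for the integral, and the QRS onset/offset sample indices supplied by the caller are all hypothetical choices made for this example.

```python
import numpy as np


def magnitude_3d(xyz):
    """L2 norm across the X, Y and Z leads.

    xyz: array-like of shape (3, n_samples). Returns shape (n_samples,).
    """
    xyz = np.asarray(xyz, dtype=float)
    return np.sqrt(np.sum(xyz ** 2, axis=0))


def qrs_vti(signal, qrs_on, qrs_off, fs_hz):
    """Voltage-time integral over the QRS complex (trapezoidal rule).

    Assumption: the integral is taken over the absolute signal between the
    onset and offset sample indices (inclusive). The result is in
    amplitude units multiplied by seconds.
    """
    seg = np.abs(np.asarray(signal, dtype=float)[qrs_on:qrs_off + 1])
    dt = 1.0 / fs_hz
    return float(np.sum((seg[1:] + seg[:-1]) * 0.5) * dt)


def has_lvh(lvmi_g_m2, sex):
    """LVH label from left ventricular mass index (g/m^2).

    Thresholds are those stated in the abstract: >95 in women, >115 in men.
    The "F"/"M" coding is an assumption of this sketch.
    """
    if sex == "F":
        return lvmi_g_m2 > 95.0
    if sex == "M":
        return lvmi_g_m2 > 115.0
    raise ValueError("sex must be 'F' or 'M' in this sketch")
```

For a representative beat stored as a (3, n_samples) array, `qrs_vti(magnitude_3d(beat), on, off, fs_hz)` would give the integral for the 3D signal, and `qrs_vti(beat[2], on, off, fs_hz)` the integral for the Z lead alone.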

Results

In the test set (n=54,984), AUC for LVH classification was higher for ML models using ECG features (LGBM 0.794, MLP 0.793, ResNet 0.795) than for the best individual ECG variable (VTI_QRS-Z 0.707), the best traditional criterion (Cornell voltage-duration product 0.716), and the CNN using ECG signals (0.788). Among patients without LVH who had a follow-up echocardiogram more than 1 year later (the one closest to 5 years), LGBM false positives had 3.07-fold higher odds (95% CI 2.44, 3.86) of developing future LVH than true negatives (p<0.0001).
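
The abstract does not say how the odds ratio and its confidence interval were estimated. One common unadjusted approach, shown here only as a sketch, takes a 2x2 table of counts and builds a Wald (Woolf) interval on the log odds ratio. The function name and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    a: false positives who later developed LVH
    b: false positives who did not
    c: true negatives who later developed LVH
    d: true negatives who did not
    z: normal quantile (1.96 gives an approximate 95% interval)

    All four counts must be non-zero for this formula to be defined.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not study data).
or_est, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=10, d=90)
print(f"OR {or_est:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 corresponds to a difference in odds that is significant at roughly the 5% level under this approximation; an adjusted estimate would instead come from a regression model.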

Conclusions

ML models are superior to traditional ECG criteria for classifying LVH. Models trained on extracted ECG features, including latent variational autoencoder representations, can outperform CNN models trained directly on ECG signals.
