HEART: Hierarchical ensemble model using augmented representations and tabular learning for coronary artery disease prediction

Abstract

Coronary artery disease (CAD) remains one of the most widespread and life-threatening cardiovascular diseases and ranks among the leading causes of mortality worldwide. Its high prevalence highlights the urgent need for effective early detection, yet diagnosis often relies on invasive or imperfect screening tools that delay intervention and increase risk. To address this challenge, we introduce HEART, a novel machine learning framework that combines structured clinical knowledge with advanced ensemble learning and data-centric augmentation to enhance early CAD prediction. HEART is a two-level ensemble in which nine diverse models act as base learners: Logistic Regression, Elastic Net Regression, Support Vector Machine, K-Nearest Neighbors, Radius Neighbors, Extra Trees, LightGBM, TabNet, and TabPFN. Their predictions are combined by a TabPFN meta-learner that captures complex interactions among the model outputs. We use Mutual Information (MI) for feature selection. To address class imbalance and limited data, we use a hybrid augmentation strategy that combines the synthetic minority oversampling technique (SMOTE) with class-specific autoencoder reconstructions, expanding the dataset to a final set of 1,000 samples. Evaluated on the Z-Alizadeh Sani dataset under fully nested stratified ten-fold cross-validation, HEART achieves a top accuracy of 91% in a comparison against nine other distinct classifiers.
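The abstract does not give implementation details for the augmentation step, so the following is only a minimal sketch of the interpolation idea behind SMOTE, written with the Python standard library. The function name `smote_like_oversample`, its parameters, and the toy data are hypothetical and are not taken from the HEART code. The autoencoder half of the hybrid strategy is not shown.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (SMOTE-style sketch).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    This is an illustrative assumption, not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (made-up values).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_like_oversample(minority, n_new=6)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range spanned by the observed minority data. In a nested cross-validation setup such as the one the abstract describes, oversampling of this kind would normally be fitted on the training folds only, so that synthetic points derived from held-out samples do not leak into evaluation.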
