Advancing cardiovascular disease risk prediction beyond conventional methods: a systematic review of multimodal machine learning models integrating traditional clinical factors and multi-omics data

Lyns Etienne
Pierre Bauvin
Alaedine Benani
Maryne Lepoittevin
Sylvain Bodard

Abstract

Background

Cardiovascular disease (CVD) is a leading global health burden. Traditional risk prediction models, though widely used, often overlook genetic predisposition and other complex biological factors, which significantly impact CVD risk. The emergence of multi-omics technologies enables a more comprehensive view of an individual’s risk, but integrating such high-dimensional data has been challenging and requires advanced computational approaches. Recent advances in machine learning offer powerful tools to synthesize and integrate these high-throughput datasets, providing a promising approach to improving CVD risk stratification.

Objective

This systematic review assesses whether CVD risk prediction models incorporating omics data alongside clinical and other variables improve prediction compared to using clinical or omics data alone.

Methods

A systematic search was conducted across PubMed (MEDLINE), Embase, and Web of Science databases in June 2025 using keywords related to CVD, risk prediction, multi-omics data, and machine learning. Studies reporting on models comparing multi-omics data with traditional clinical factors for CVD risk prediction were included. Data on model performance, methodologies, and subgroup analyses were extracted and synthesized.

Results

Studies consistently showed that models integrating clinical factors with multiple modalities, including genomic (n≈58), biomarker (n≈109), and biological (n≈125) data, among other data types, significantly enhanced CVD risk prediction, with combined clinical+genomic models outperforming single-modality approaches. Other data types, such as lifestyle factors and proteomics, further refined performance. Subgroup analyses revealed decreased predictive accuracy across diverse ancestries and age-specific performance differences. Importantly, genetically defined high-risk individuals often derived greater absolute benefit from targeted clinical interventions. The models spanned applications from predicting risk in asymptomatic individuals for primary prevention to guiding prognosis in diseased patients for secondary prevention.
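
As a minimal illustration of how a combined model can be said to outperform single-modality approaches, the sketch below compares discrimination for a clinical-only score, a genomic-only score, and their combination. The abstract does not state which metric the reviewed studies used. The area under the ROC curve (AUC), the eight-person toy cohort, the score values, and the equal-weight sum used to combine the two scores are all assumptions made for illustration, not data or methods from the review.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case has a higher score
    than a randomly chosen non-case, with ties counted as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy cohort (NOT data from the review):
# (clinical risk points, polygenic risk points, CVD event: 1 = yes, 0 = no)
COHORT: List[Tuple[int, int, int]] = [
    (30, 40, 1),
    (20, 35, 1),
    (10, 30, 1),
    (25, 5, 1),
    (15, 10, 0),
    (22, 5, 0),
    (5, 20, 0),
    (12, 32, 0),
]

labels = [y for _, _, y in COHORT]
clinical = [c for c, _, _ in COHORT]
genomic = [g for _, g, _ in COHORT]
# Assumption: a naive equal-weight sum stands in for a fitted multimodal model.
combined = [c + g for c, g, _ in COHORT]

auc_clinical = auc(clinical, labels)
auc_genomic = auc(genomic, labels)
auc_combined = auc(combined, labels)

print(f"clinical only: {auc_clinical:.3f}")
print(f"genomic only:  {auc_genomic:.3f}")
print(f"combined:      {auc_combined:.3f}")
```

On this toy cohort the combined score ranks cases above non-cases more often than either score alone. In a real study the combination would be fitted on training data and evaluated on held-out data, typically with confidence intervals and calibration checks in addition to discrimination.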

Conclusion

CVD risk prediction models integrating genomic, clinical, and other variables offer superior accuracy and refined stratification. These advanced models hold immense potential for personalized interventions across diverse populations. Future research should prioritize real-world implementation and broad validation to translate these findings into routine clinical practice.

Version published to 10.1101/2025.10.07.25337473 on medRxiv
Oct 8, 2025

Related articles

Machine Learning Insights for Cardiovascular Risk Prediction in Diabetic Patients: Emphasis on Renal and Cardiac Markers Using Random Forests

This article has 1 author:
1. Julian Borges
This article has no evaluations
Latest version Jan 21, 2026

A Unified Framework for Survival Prediction: Combining Machine Learning Feature Selection with Traditional Survival Analysis in Heart Failure and METABRIC Breast Cancer

This article has 7 authors:
1. Fangya Tan
2. Jian-Guo Zhou
3. Shuqiao Li
4. Bowen Long
5. Srikar Bellur
6. Yang Zhou
7. Mark Newman
This article has no evaluations
Latest version Jan 29, 2026

Benchmarking Ensemble Machine Learning Algorithms for the Early Prediction of Stroke in Imbalanced Clinical Cohorts: A Comparative Analysis and Decision Curve Assessment

This article has 2 authors:
1. Ibrahim Ibrahim Shuaibu
2. Yousaf Hussain
This article has no evaluations
Latest version Jan 22, 2026
