Deep Learning Architectures for Multi-Omics Data Integration: Bridging Biomarker Discovery and Clinical Translation

Abstract

The integration of heterogeneous molecular data across multiple omics layers is essential for understanding complex disease biology, yet conventional analytical approaches struggle with the high dimensionality, non-linearity, missingness, and technical variability of multi-omics datasets. This review critically evaluates how deep learning (DL) methodologies address these challenges and assesses their translational relevance within biomedical informatics. We systematically review contemporary DL-based approaches for multi-omics data integration and categorize them into two principal methodological strategies: (i) unsupervised latent-space learning models, such as variational autoencoders, which enable probabilistic feature fusion, denoising, and data imputation; and (ii) network-based integration frameworks, including graph neural networks, which incorporate biological prior knowledge through molecular interaction graphs. Supervised extensions, including attention-based models for clinical prediction tasks, are also examined, with emphasis on architectural design, data fusion mechanisms, and validation practices. DL-based multi-omics integration methods have demonstrated improved performance over conventional approaches in disease subtyping, biomarker discovery, and prognostic modeling by capturing complex, non-linear interactions across molecular layers. Latent-space models provide robust representations in the presence of incomplete data, while network-based approaches facilitate the identification of biologically coherent molecular subnetworks. However, most reported performance gains rely on internal validation, with limited evidence of external generalizability, robust interpretability, or readiness for clinical deployment. Deep learning has emerged as a powerful paradigm for multi-omics integration in biomedical informatics, enabling clinically relevant molecular stratification and prediction. Nevertheless, effective translation into routine clinical practice requires a shift beyond predictive accuracy toward explainable modeling, standardized external validation on independent cohorts, and integration with real-world clinical data, all of which are essential for establishing trustworthy and actionable decision-support systems for precision medicine.
