Multi-omics integration and batch correction using a modality-agnostic deep learning framework
Abstract
State-of-the-art biotechnologies allow the detection of different molecular species in the same biological sample, generating complex, high-dimensional multi-modal datasets. Gaining a holistic understanding of biological phenomena, such as oncogenesis or aging, requires integrating these diverse modalities into low-dimensional data representations while correcting for technical artifacts. Here we present MIMA, a modular, unsupervised AI framework for multi-omics data integration and batch correction. Applied to complex spatial and single-cell datasets, MIMA effectively removes batch effects while preserving biologically relevant information, and learns representations predictive of expert pathologist annotations. Additionally, it enables cross-modal translation, uncovers molecular patterns not captured by manual annotations, and, despite being modality-agnostic, performs on par with specialized state-of-the-art tools. MIMA's flexibility and scalability make it a powerful tool for multimodal data analysis and a foundation for AI-based, multi-omics-augmented digital pathology frameworks, offering new opportunities for improved patient stratification and precision medicine through the comprehensive integration of high-dimensional molecular data and histopathological imaging.