Interventionally-guided representation learning for robust and interpretable AI models in cancer medicine


Abstract

Machine learning models in cancer medicine are powerful but can suffer from a lack of robustness and interpretability that limits their medical and scientific utility. Here, we introduce a new class of models for high-dimensional molecular data that are guided by interventional auxiliary information to learn latent representations that are informative, interpretable by design and, as we show, generalize strongly to new data distributions and settings. Specifically, the proposed models use causal information from genetic loss-of-function screens to guide learning and thereby discover representations that are broadly useful, even far beyond the original training context. Empirical results on cancer cell line datasets support the hypothesis that causal guidance permits strong generalization, and we find that the models can be applied in an entirely “zero-shot” fashion, with no fine-tuning or adaptation whatsoever, to entire cancer types that were not encountered during training. Using several large patient cohorts, we show that this strong generalization also holds in the “bench-to-bedside” context of transferring cell line-based models to clinical data. In this setting, we find that our models, with all representation learning steps trained only on cell line data, can be efficiently transferred to patient data, opening up new ways to drive clinical modelling with tractable laboratory assays. Our results provide a new and general way of leveraging causal and interventional information to construct data-efficient and robust AI-based predictive models for molecular medicine.
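
The abstract does not specify the model architecture, so the following is purely an illustrative sketch of one common way such interventional guidance could be realized: a multi-task autoencoder whose latent code must both reconstruct the molecular input (e.g., expression profiles) and predict readouts of a genetic loss-of-function screen (e.g., gene-essentiality scores). All names and hyperparameters below (GuidedAutoencoder, screen_head, alpha) are hypothetical and are not taken from the paper.

```
import torch
import torch.nn as nn

class GuidedAutoencoder(nn.Module):
    """Autoencoder whose latent space is additionally supervised by
    interventional (loss-of-function screen) readouts. Illustrative only."""

    def __init__(self, n_genes: int, n_latent: int, n_screen_targets: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_genes, 256), nn.ReLU(),
            nn.Linear(256, n_latent),
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 256), nn.ReLU(),
            nn.Linear(256, n_genes),
        )
        # Auxiliary head: predict per-target essentiality scores from the latent code,
        # which is what ties the representation to the interventional data.
        self.screen_head = nn.Linear(n_latent, n_screen_targets)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.screen_head(z), z


def guided_loss(x, screen_scores, model, alpha=1.0):
    """Reconstruction loss plus an interventional-guidance term.
    `alpha` (hypothetical) trades off reconstruction against screen prediction."""
    x_hat, screen_hat, _ = model(x)
    recon = nn.functional.mse_loss(x_hat, x)
    guidance = nn.functional.mse_loss(screen_hat, screen_scores)
    return recon + alpha * guidance
```

Under this reading, “zero-shot” transfer would amount to freezing the trained encoder and applying it unchanged to expression profiles from cancer types or patient cohorts never seen during training; whether this matches the authors' exact setup would need to be checked against the full article.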
