MetaSage: Machine Learning-Based Prioritization of Metabolic Regulators from Multi-Omics Data

Abstract

Dysregulation of metabolites is a hallmark of cancer, yet the underlying regulatory mechanisms remain poorly understood. To systematically explore metabolic regulation across cancers, we developed MetaSage, an XGBoost-based machine learning pipeline that integrates a context-agnostic knowledge graph with multi-omics datasets. Using harmonized data from 15 cohorts spanning 11 cancer types, we identified 442 variable metabolites and found that genes and upstream metabolites showed comparable regulatory influence. Predictable metabolites, defined by a significant correlation between predicted and measured levels, were identified using our pipeline and varied widely across cohorts, partly due to batch effects. For each predictable metabolite, key regulatory features were determined using Shapley values. This yielded 1,146 gene features and 363 precursor metabolites as important regulators. Network analysis of 22 recurrent metabolites revealed a mix of conserved and cancer-type-specific regulatory patterns. Our framework enables robust discovery of metabolite regulation and therapeutic insights in cancer.
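
The "predictable metabolite" criterion above (a significant correlation between predicted and measured levels) can be illustrated with a minimal sketch. The abstract does not state which correlation statistic, significance test, or threshold MetaSage uses, so the Pearson coefficient, the one-sided permutation test, `alpha=0.05`, and the function names `pearson_r` and `is_predictable` below are all assumptions made for illustration, not the authors' implementation.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def is_predictable(measured, predicted, alpha=0.05, n_perm=2000, seed=0):
    """Hypothetical check: is the predicted/measured correlation positive and significant?

    Assumption: significance is estimated with a one-sided permutation test
    that shuffles the predictions; the source does not specify the test used.
    """
    rng = random.Random(seed)
    observed = pearson_r(measured, predicted)
    shuffled = list(predicted)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson_r(measured, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value, (observed > 0 and p_value < alpha)
```

In this sketch, `measured` would hold a metabolite's measured levels across the held-out samples of one cohort and `predicted` the corresponding model outputs; a metabolite would be flagged as predictable only when the third return value is `True`.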
