Characterization of metabolic phenotypes in breast cancer through the integration of genome-scale metabolic models and machine learning

Rigoberto Rincón-Ballesteros
Francisco Javier Álvarez-Padilla
German Preciat

Abstract

The metabolic heterogeneity of breast cancer represents a significant challenge for the identification of biomarkers and therapeutic targets. To address this problem, we integrated genome-scale metabolic models with machine learning algorithms, aiming to characterize the metabolic phenotypes associated with the disease.

We generated 90 patient-specific metabolic models from clinical and gene expression data from the TCGA-BRCA project (66 tumor and 24 normal samples). Metabolic fluxes were estimated by gene expression-based optimization that minimizes a weighted L2 norm. The Mann-Whitney test with Benjamini-Hochberg correction was then applied to identify the most discriminating reactions.
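The reaction-selection step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the flux values are synthetic, the reaction names (`rxn_shifted`, `rxn_same`) are hypothetical, the 0.05 threshold is an assumption, and the Mann-Whitney p-value uses a hand-written normal approximation rather than an exact test. In practice a library routine such as `scipy.stats.mannwhitneyu` would likely be used instead.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie and continuity corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2
    n = n1 + n2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u1, 1.0
    z = max((abs(u1 - mu) - 0.5) / sigma, 0.0)
    return u1, min(math.erfc(z / math.sqrt(2)), 1.0)

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):            # walk from largest to smallest p-value
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Synthetic fluxes (assumption): 66 tumor and 24 normal values per reaction.
fluxes = {
    "rxn_shifted": ([5.0 + 0.01 * i for i in range(66)], [0.01 * i for i in range(24)]),
    "rxn_same": ([float(i % 6) for i in range(66)], [float(i % 6) for i in range(24)]),
}
reactions = list(fluxes)
pvals = [mann_whitney_u(*fluxes[r])[1] for r in reactions]
adjusted = benjamini_hochberg(pvals)
significant = [r for r, q in zip(reactions, adjusted) if q < 0.05]
print(significant)
```

Only the reaction whose tumor and normal fluxes are fully separated survives the correction; the reaction with identical distributions in both groups does not.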

We evaluated five classification algorithms (K-Nearest Neighbors, Support Vector Machines, Logistic Regression, Decision Tree, and Naive Bayes) using stratified 5-fold cross-validation. All models distinguished healthy from cancerous phenotypes with good overall performance; K-Nearest Neighbors and Support Vector Machines performed best, with accuracy close to 0.98 and a ROC-AUC of 1.00.
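As a rough illustration of the evaluation scheme, the sketch below builds stratified 5-fold splits for 66 tumor and 24 normal labels and scores a small hand-written k-nearest-neighbors classifier on them. The feature values are synthetic and trivially separable, `k=3` and the seed are assumptions, and only one of the five algorithms is shown; a real analysis would more likely rely on scikit-learn's `StratifiedKFold` and `KNeighborsClassifier`.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that preserve the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)        # deal each class round-robin
    return folds

def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Synthetic data (assumption): label 1 = tumor, label 0 = normal.
X = [[5.0 + 0.01 * i, 1.0] for i in range(66)] + [[0.01 * i, 1.0] for i in range(24)]
y = [1] * 66 + [0] * 24

folds = stratified_folds(y, k=5, seed=0)
fold_accuracies = []
for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(y)) if i not in held_out]
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    correct = sum(knn_predict(train_X, train_y, X[i]) == y[i] for i in test_idx)
    fold_accuracies.append(correct / len(test_idx))

mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print(fold_accuracies, mean_accuracy)
```

Stratification matters here because the classes are imbalanced: each fold keeps 13 or 14 tumor samples and 4 or 5 normal samples, so no fold is evaluated on a single class.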

Analysis of the discriminating metabolic reactions revealed significant alterations in pathways such as extracellular transport (up to 60 significant reactions), fatty acid oxidation, and nucleotide interconversion. These results highlight the potential of combining metabolic modeling with machine learning to deepen the understanding of tumor metabolism, although experimental validation and statistical refinement are needed in future studies.

Author summary

The complex metabolic heterogeneity of breast cancer makes it difficult to find effective biomarkers and therapies. To address this, we propose a computational approach that combines genome-scale metabolic models with machine learning. Using clinical and gene expression data from the TCGA-BRCA project, we generated patient-specific models, predicted metabolic fluxes, selected the most discriminating ones, and evaluated several supervised classification algorithms for distinguishing normal from tumor tissue based on these fluxes.

Our results identify distinct metabolic patterns in breast cancer and highlight pathways such as extracellular transport and fatty acid oxidation. This study demonstrates the potential of these tools for characterizing tumor metabolism, although we acknowledge that the sample size (n=90) is a limitation and that future studies with more data are needed to confirm and generalize our findings. Nevertheless, this work is a step towards more precise and personalized strategies in breast cancer diagnosis or treatment.

Version published to 10.1101/2025.07.29.667003 on bioRxiv
Aug 1, 2025
