Breast Cancer Biomarker Discovery Using Explainable Machine Learning on Expression-Level Signatures

Abstract

Background. Identifying biomarkers associated with the broad heterogeneity of breast cancer (BC) is crucial for advancing patient management and improving treatment outcomes. However, prior research on BC signal transduction network (STN) activity generalises gene expression and focuses primarily on a limited subset of BC STNs and network genes of interest, overlooking expression-level dynamics and the overlapping, interconnected nature of BC STNs. To address this gap, we propose a two-fold state-of-the-art approach that explicitly captures the expression modulation patterns among patients. We analyse all major known BC pathology STNs simultaneously and apply explainable artificial intelligence (XAI) post-survival analyses to identify novel biomarkers.

Results. Using XAI importance scoring, we identified 28 biomarkers ranked by their relevance to BC prognosis. To validate our findings, we searched the literature and databases for independent evidence on the roles of a sample of these genes in BC. For instance, the Dickkopf-related protein 4 (DKK4) and kringle-containing transmembrane protein 1 (KREMEN1) protein families are known to cooperate in negatively regulating the Wnt/β-catenin pathway, which is linked to cell proliferation and migration. Among the top-ranked biomarkers, these two genes are comparatively less explored in BC, underscoring their prognostic potential and the need for further investigation.

Conclusions. We show that integrating greater STN diversity with XAI-based patient outcome modelling can uncover previously overlooked or novel biomarkers, providing deeper insights into BC complexity. Moreover, triaging patients into expression modulation patterns provides additional insights for population dynamics and targeted therapy. With additional training data and real-world evaluations, these findings promise to deliver more tailored treatment options, ultimately improving patient outcomes.
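To make the idea of ranking genes by an importance score on a survival outcome concrete, the following is a minimal sketch only. The abstract does not name the XAI method used, so permutation importance scored with Harrell's concordance index is swapped in here purely as an illustration. The gene names (`GENE_A`, `GENE_B`, `GENE_C`), the synthetic cohort, and the fixed linear `predict_risk` function standing in for a fitted survival model are all assumptions and are not taken from the study.

```python
import math
import random


def concordance_index(risk, time, event):
    """Harrell's C: share of comparable pairs in which the patient with the
    shorter observed survival also has the higher predicted risk."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


def permutation_importance(predict, X, time, event, n_repeats=10, seed=0):
    """Mean drop in C-index when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = concordance_index([predict(r) for r in X], time, event)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            score = concordance_index([predict(r) for r in X_perm], time, event)
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical gene names and synthetic expression data (assumptions).
genes = ["GENE_A", "GENE_B", "GENE_C"]
rng = random.Random(42)
X = [[rng.gauss(0.0, 1.0) for _ in genes] for _ in range(120)]


def predict_risk(row):
    # Stand-in for a fitted survival model: GENE_B is deliberately ignored.
    return 2.0 * row[0] + 0.0 * row[1] + 0.5 * row[2]


# Higher risk means a higher hazard, hence shorter simulated survival times.
time = [rng.expovariate(math.exp(predict_risk(r))) for r in X]
event = [rng.random() < 0.8 for _ in X]  # roughly 20% censored

baseline, importances = permutation_importance(predict_risk, X, time, event)
ranking = sorted(zip(genes, importances), key=lambda pair: -pair[1])
for gene, score in ranking:
    print(f"{gene}: mean C-index drop = {score:.3f}")
```

Because the stand-in model ignores `GENE_B`, shuffling that column leaves every prediction unchanged and its importance is exactly zero, while the gene with the largest coefficient ranks first. A real analysis would replace `predict_risk` with a model fitted to patient data and evaluate importance on held-out samples.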
