MyGESig: a population-specific gene signature improves survival prediction in Malaysian breast cancer patients

Abstract

Accurate prognostic models are essential for guiding treatment decisions and improving patient outcomes in breast cancer. Population-specific models are needed to account for genetic, clinical, and pathological differences across populations. In this study, the widely used and freely available PREDICT v3.0 breast cancer prognostic model was first validated in the multiethnic Malaysian Breast Cancer (MyBrCa) cohort to assess its performance. Because PREDICT v3.0 performed only moderately in this population, a machine learning workflow was developed to integrate gene expression and clinical information for classifying patients by their 10-year prognosis. A 77-gene signature, termed MyGESig, was derived from the transcriptomes of 258 MyBrCa patients. Using this signature in combination with clinical variables, an ensemble-based model achieved a median area under the receiver operating characteristic curve (AUROC) of 0.92 in the hold-out testing set and 0.90 in the independent MyBrCa dataset. While the model exhibited poor generalizability in external cohorts, its discriminative performance improved when it was trained and tested within the same population (median AUROC: 0.71 in METABRIC; 0.84 in SCAN-B), validating the prognostic value of the gene set. Together, these findings demonstrate the value of incorporating population-specific gene expression datasets into prognosis prediction and highlight the need to develop and validate models tailored to diverse populations in breast cancer.
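
For readers less familiar with the metric reported above, the following is a minimal sketch of how a median AUROC over repeated evaluation runs could be computed. It is not the authors' pipeline: the function names `auroc` and `median_auroc` are hypothetical, the labels and scores are toy values rather than study data, and the encoding of one outcome class as 1 and the other as 0 is an assumption made only for illustration.

```python
from statistics import median


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half). Equivalent to the Mann-Whitney U statistic
    divided by the number of positive-negative pairs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def median_auroc(runs):
    """Median AUROC over several (labels, scores) evaluation runs,
    e.g. one run per resampled train/test split (assumed setup)."""
    return median(auroc(labels, scores) for labels, scores in runs)


# Toy example with made-up values, not data from the study.
toy_runs = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 0, 1, 1], [0.1, 0.2, 0.7, 0.9]),
    ([0, 1, 0, 1], [0.6, 0.4, 0.3, 0.5]),
]
print(median_auroc(toy_runs))
```

Reporting the median rather than a single AUROC summarizes performance across repeated runs and is less sensitive to one unusually favorable or unfavorable split.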
