Assessing Genotype-Phenotype Correlations with Deep Learning in Colorectal Cancer: A Multi-Centric Study

Abstract

Background

Deep learning (DL) has emerged as a powerful tool for predicting genetic biomarkers directly from digitized hematoxylin and eosin (H&E) slides in colorectal cancer (CRC). However, few studies have systematically investigated the predictability of biomarkers beyond routinely tested alterations such as microsatellite instability (MSI) and BRAF and KRAS mutations.

Methods

Our primary dataset comprised H&E slides of CRC tumors from five cohorts totaling 1,376 patients who underwent comprehensive panel sequencing; an additional 536 patients from two public datasets were used for validation. We developed a single transformer-based DL model that predicts multiple genetic alterations directly from the slides. The model’s performance was compared against conventional single-target models, and potential confounders were analyzed.

Findings

The multi-target model predicted numerous biomarkers from pathology slides, matching and in some cases exceeding the performance of single-target transformers. The Area Under the Receiver Operating Characteristic curve (AUROC, mean ± std) on the primary external validation cohorts was: BRAF (0·78 ± 0·01), hypermutation (0·88 ± 0·01), MSI (0·93 ± 0·01), RNF43 (0·86 ± 0·01). This pattern of biomarker predictability was mirrored across metrics and co-occurrence analyses. However, biomarkers with high AUROCs largely correlated with MSI, and pathological examination showed that model predictions depended considerably on MSI-associated morphology.
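
For readers unfamiliar with the "AUROC, mean ± std" notation, the following is a minimal sketch of how such a summary can be computed. It is not the authors' evaluation code. It assumes, for illustration only, that the mean and standard deviation are taken over several trained models scored on the same validation patients; the abstract does not state what the spread is taken over. The function names `auroc` and `summarize_auroc` and the toy data are hypothetical.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Ties count as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_auroc(labels, runs):
    """Return (mean, sample standard deviation) of the AUROC over several
    score vectors. Assumption: each entry of `runs` holds the scores of
    one trained model on the same patients."""
    values = [auroc(labels, scores) for scores in runs]
    return mean(values), stdev(values)


# Hypothetical toy data: four patients, two models.
labels = [1, 1, 0, 0]
run_a = [0.9, 0.8, 0.2, 0.1]
run_b = [0.9, 0.3, 0.4, 0.1]
auroc_mean, auroc_std = summarize_auroc(labels, [run_a, run_b])
print(f"AUROC: {auroc_mean:.2f} ± {auroc_std:.2f}")
```

The pairwise formulation is quadratic in the number of patients, which is acceptable for a sketch; a rank-based implementation would be preferable for large cohorts.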

Interpretation

Our study demonstrates that multi-target transformers can predict the biomarker status for numerous genetic alterations in CRC directly from H&E slides. However, their predictability is mainly associated with the MSI phenotype, despite indications of slight biomarker-inherent contributions to a phenotype. Our findings underscore the need to analyze confounders in AI-based oncology biomarkers. To enable this, we developed a validated model applicable to other cancers and to larger, more diverse datasets.
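
One generic way to probe the kind of confounding described here is to recompute the AUROC within each stratum of the suspected confounder. The sketch below is not the authors' analysis. It assumes each patient carries an MSI status label, and the names `stratified_auroc` and "MSS" (used here for microsatellite-stable cases) as well as the toy data are hypothetical. If a biomarker's overall AUROC is high but falls toward 0·5 within the MSI and non-MSI groups separately, the prediction is probably tracking MSI rather than the biomarker itself.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_auroc(labels, scores, strata):
    """AUROC computed separately within each stratum of a suspected
    confounder. A stratum lacking positives or negatives maps to None."""
    result = {}
    for level in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == level]
        y = [labels[i] for i in idx]
        p = [scores[i] for i in idx]
        result[level] = auroc(y, p) if 0 < sum(y) < len(y) else None
    return result


# Hypothetical toy cohort: the biomarker is enriched in MSI tumors and
# the scores mostly separate MSI from MSS cases.
msi_status = ["MSI"] * 4 + ["MSS"] * 6
mutated = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.80, 0.90, 0.70, 0.85, 0.20, 0.30, 0.10, 0.25, 0.15, 0.35]

overall = auroc(mutated, scores)
by_msi = stratified_auroc(mutated, scores, msi_status)
print(f"overall: {overall:.2f}")
for level, value in by_msi.items():
    print(f"{level}: {value:.2f}")
```

In this toy cohort the overall AUROC is 0.75, while the within-stratum values are about 0.33 (MSI) and 0.40 (MSS): the apparent predictability disappears once MSI status is held fixed.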

Funding

The German Federal Ministry of Health, the Max-Eder-Programme of German Cancer Aid, the German Federal Ministry of Education and Research, the German Academic Exchange Service, and the EU.
