Moving Beyond Binary Biomarkers: Machine Learning Model Resolves Concurrent and Molecularly Heterogeneous Mismatch Repair and Homologous Recombination Deficiencies in Prostate Cancer
Abstract
Current DNA damage repair (DDR) biomarkers employ binary classifications that fail to capture the molecular complexity of tumors with concurrent repair deficiencies. Using genomic analysis, we stratified 672 metastatic prostate cancer patients into 11 DDR subgroups and identified 51 molecular signatures with weighted roles in class identity. We identified a very-high tumor mutational burden (TMB) subset, defined by 19 or more mutations/Mb, as a molecularly distinct group marked by preserved genomic integrity and enhanced immunogenicity. Critically, 2.3% of tumors exhibited concurrent TMB-High and homologous recombination repair (HRR)-mutant phenotypes, while 1.5% harbored biallelic mismatch repair (MMR) loss without MMR deficiency (MMRd) signatures. Clinical validation in 130 patients demonstrated superior immunotherapy responses in tumors with very high TMB. We developed CHIMERA-DDR, a probabilistic machine learning tool that integrates these 51 genomic features using a nested Random Forest architecture to infer seven clinically relevant DDR subgroups. After addressing concerns about model overfitting, CHIMERA-DDR showed high classification performance (AUCs of 0.919 to 0.999) in detecting MMRd and HRR-mutant molecular subtypes with or without concurrent DDR deficiencies, resolving admixed phenotypes to enable precision therapeutic stratification beyond binary methods.
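
The very-high TMB cutoff quoted above (19 or more mutations/Mb) can be made concrete with a small sketch. This is an illustration only, not the authors' pipeline. The `TumorSample` fields, the sample values, and the assumption that TMB is simply the mutation count divided by the assayed region size in megabases are hypothetical simplifications, and real assays apply variant filtering that is not shown here.

```python
from dataclasses import dataclass

# Cutoff stated in the abstract: 19 mutations/Mb or more counts as very high.
TMB_VERY_HIGH_CUTOFF = 19.0


@dataclass(frozen=True)
class TumorSample:
    """Hypothetical record; field names are illustrative assumptions."""
    sample_id: str
    mutation_count: int        # somatic mutations counted in the assayed region
    assayed_megabases: float   # size of the assayed region, in Mb


def tumor_mutational_burden(sample: TumorSample) -> float:
    """Return TMB as mutations per megabase (simplified definition)."""
    if sample.assayed_megabases <= 0:
        raise ValueError("assayed_megabases must be positive")
    return sample.mutation_count / sample.assayed_megabases


def is_tmb_very_high(sample: TumorSample) -> bool:
    """Flag samples at or above the 19 mutations/Mb cutoff."""
    return tumor_mutational_burden(sample) >= TMB_VERY_HIGH_CUTOFF


# Made-up example values, chosen only to exercise both sides of the cutoff.
samples = [
    TumorSample("S1", mutation_count=15, assayed_megabases=1.5),  # 10 mut/Mb
    TumorSample("S2", mutation_count=38, assayed_megabases=2.0),  # 19 mut/Mb
]
flags = {s.sample_id: is_tmb_very_high(s) for s in samples}
```

Because the cutoff is inclusive ("19 mutations/Mb or more"), a sample at exactly 19 mutations/Mb is flagged, which is why the comparison uses `>=`.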