High Concordance Between GPT-4o and Multidisciplinary Tumor Board Decisions in Breast Cancer: A Retrospective Decision Support Analysis

Abstract

Background: Large language models (LLMs) such as ChatGPT have gained attention for their potential to assist clinical decision-making in oncology. However, real-world validation of these models against multidisciplinary tumor board (MTB) recommendations, particularly in breast cancer treatment, remains limited.

Methods: This retrospective study assessed the concordance between GPT-4o and the decisions of a breast cancer MTB over a six-month period. Thirty-three patients were included. Structured clinical data were entered into GPT-4o using standardized prompts, and treatment plans were generated in two independent sessions per case. Seven therapeutic domains were evaluated: surgery, radiotherapy, hormonal therapy, neoadjuvant therapy, adjuvant therapy, genetic counseling/testing, and dual HER2-targeted therapy. Two blinded reviewers scored concordance using a 5-point Likert scale. Inter-rater reliability and classification metrics were calculated.

Results: GPT-4o generated consistent recommendations across both sessions for all patients. Full concordance (5/5) with MTB decisions was observed in 31 of 33 cases (93.9%), while partial concordance (4/5) occurred in 2 cases (6.1%) due to differences regarding genetic counseling. Inter-rater agreement was perfect (Cohen’s kappa = 1.00), and the mean concordance score was 4.94 out of 5. The model achieved an overall accuracy of 93.9%, precision of 93.9%, recall of 100%, and F1 score of 96.8%.

Conclusion: GPT-4o demonstrated a high level of agreement with expert multidisciplinary decisions in breast cancer care when provided with structured clinical input. These findings support its potential as a reproducible, guideline-consistent decision-support tool in oncology workflows.
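The arithmetic behind the reported figures can be sketched as follows. The Likert score counts (31 cases at 5/5, 2 cases at 4/5) come from the abstract. The confusion counts are an assumption, since the abstract does not state them: the 31 fully concordant cases are treated as true positives and the 2 partially concordant cases as false positives, with no false negatives or true negatives. This is one reading that matches the reported accuracy, precision, and recall, not necessarily the authors' own definition.

```python
# Likert scores from the abstract: 31 cases scored 5/5, 2 cases scored 4/5.
scores = [5] * 31 + [4] * 2
mean_score = sum(scores) / len(scores)

# ASSUMPTION (not stated in the abstract): full concordance counts as a
# true positive, partial concordance as a false positive, and there are
# no false negatives or true negatives.
tp, fp, fn, tn = 31, 2, 0, 0

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"mean concordance score: {mean_score:.2f}")
print(f"accuracy:  {accuracy:.1%}")
print(f"precision: {precision:.1%}")
print(f"recall:    {recall:.1%}")
print(f"F1:        {f1:.4f}")
```

Under these assumed counts the mean score, accuracy, precision, and recall agree with the abstract. The F1 value works out to 62/64, about 0.969, which is marginally above the reported 96.8%; the small gap may come from rounding or truncation in the source, or from a different definition of the counts.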
