Exploring the role of Artificial Intelligence in Bariatric and Metabolic Surgery
Abstract
Introduction: Artificial intelligence (AI) has gained increasing relevance in bariatric and metabolic surgery (BMS), yet its concordance with expert clinical judgement remains uncertain. This study evaluates the agreement between two AI models and bariatric and metabolic surgeons across representative clinical scenarios, and assesses the impact of directed exposure to the literature.

Material and methods: Ten evidence-based clinical scenarios were constructed from high-quality literature. Two AI models, ChatGPT-4 and DeepSeek, along with board-certified Mexican BMS surgeons, determined whether each patient was a candidate for surgery and which procedure was most appropriate, under two conditions: without access to the high-quality literature (Phase 1) and after reviewing it (Phase 2). The majority surgeon response served as the reference standard. Agreement and concordance were analyzed descriptively and with Cohen's kappa, with significance set at p < 0.05.

Results: Thirty surgeons completed both phases. In Phase 1, agreement among surgeons varied, ranging from 96.6% to 23.3% in selected cases. Compared with the surgeon reference, ChatGPT-4 showed 60% agreement (κ = 0.344) and DeepSeek showed 70% agreement (κ = 0.508). In Phase 2, surgeons demonstrated 90% agreement across phases (κ = 0.787). ChatGPT-4 showed 60% agreement without significant concordance (κ = 0.048), whereas DeepSeek maintained 70% agreement with fair concordance (κ = 0.318).

Conclusions: Surgeons maintained high consistency between phases, while AI models showed variable alignment with clinical decision-making. DeepSeek demonstrated higher adaptability to bibliographic evidence, whereas ChatGPT-4 did not. These findings highlight the need for continued refinement, contextual training and validation of AI tools to ensure safe and clinically applicable support in BMS decision-making.
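
For readers unfamiliar with the two statistics reported above, the following is a minimal sketch of how percent agreement and Cohen's kappa can be computed for two raters over the same set of cases. It is not the authors' analysis code. The function names, the procedure labels ("SG", "RYGB", "none") and the ten example responses are hypothetical and chosen only for illustration, so the resulting values are not expected to match those in the abstract.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which the two raters gave the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)

    # Chance agreement: for each label, multiply the two raters' marginal
    # proportions, then sum over all labels used by either rater.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)

    if p_expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: majority surgeon response vs. one model, ten scenarios.
reference = ["SG", "RYGB", "none", "SG", "RYGB", "SG", "none", "RYGB", "SG", "RYGB"]
model = ["SG", "RYGB", "none", "RYGB", "RYGB", "SG", "SG", "RYGB", "SG", "SG"]

print(f"agreement = {percent_agreement(reference, model):.0%}")
print(f"kappa     = {cohens_kappa(reference, model):.3f}")
```

Because kappa discounts the agreement expected by chance from each rater's label frequencies, the same percent agreement can correspond to quite different kappa values, which is consistent with the abstract reporting 60% agreement for ChatGPT-4 in both phases alongside different kappa values.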