Assessing the feasibility of using GPT models for clinical decision support in patients suspected of prostate cancer: a comparative study

Abstract

Background: Large language models (LLMs), such as the GPT models, are fine-tuned using supervised learning and reinforcement learning with human labels. Although they have shown promise in various medical fields, the feasibility and safety of using GPT models for clinical decision support (CDS) in prostate cancer remain unverified. This study evaluates the feasibility of GPT models in providing CDS for patients suspected of prostate cancer by comparing the recommendations generated by GPT models with those provided by real-world urologists.

Methods: Patient data were collected from March 2022 to December 2023 at Tianjin Medical University Cancer Institute and Hospital and Tianjin Medical University Second Hospital. A total of 113 cases with comprehensive clinical and imaging data were selected. Clinical recommendations were generated by GPT models (ChatGPT and GPT-3.5) and compared with those provided by a urologist not specializing in oncology. Three prostate cancer experts rated the recommendations for coherence, factual consistency, comprehensiveness, and potential medical harm on a 5-point Likert scale. Mann-Whitney U tests were used to assess the significance of differences.

Results: The GPT models demonstrated high factual consistency (98.1% in the high-consistency group) and coherence in generating clinical recommendations. In terms of medical harm, no significant difference was observed overall between the GPT models and the non-oncology urologist (p ≥ 0.05). However, in cases rated neutral (score = 3), the non-oncology urologist showed a higher rate of ambiguous recommendations (10.5%) than the GPT models (2.8%; p < 0.05). The GPT models responded significantly faster, averaging 5–15 seconds per case versus approximately 1 minute for the urologist.

Conclusion: GPT models show promise in providing clinical decision support for patients suspected of prostate cancer, with high factual consistency and efficient response times. However, challenges concerning comprehensiveness and potential medical harm need to be addressed before widespread clinical application. Further research is warranted to validate the empowering effect of GPT models on non-specialist clinicians in clinical decision-making.
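
The comparison described in the Methods (Mann-Whitney U tests on 5-point Likert ratings) can be illustrated with a minimal sketch. The abstract does not say which variant of the test or which software was used, so the tie-corrected normal approximation below is an assumption, and the function name `mann_whitney_u` and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test.

    Uses the normal approximation with tie correction and no continuity
    correction (an assumed variant; the abstract does not specify one).
    Returns (U statistic for x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u_x, 1.0  # all values tied: no evidence of a difference

    z = (u_x - n1 * n2 / 2) / sqrt(variance)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_x, p_value


# Hypothetical Likert ratings (1-5), made up purely for illustration.
model_scores = [5, 4, 4, 5, 3, 4, 5, 4]
urologist_scores = [4, 3, 4, 3, 3, 4, 2, 4]
u_stat, p = mann_whitney_u(model_scores, urologist_scores)
print(f"U = {u_stat:.1f}, two-sided p = {p:.3f}")
```

A rank-based test suits Likert ratings because they are ordinal and heavily tied, which is why the sketch includes the tie correction. For samples this small an exact test would usually be preferable to the normal approximation.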
