Assessing the feasibility of using GPT models for clinical decision support in patients suspected of prostate cancer: a comparative study

Xuan Liang
Xiaoyi Wang
Yuanyuan Li
Wenfeng Liao
Zhenting Zhang
Guohui Zhu
Xi Wei

Abstract

Background: Large Language Models (LLMs), such as the GPT model, leverage supervised learning and reinforcement learning with human labels for fine-tuning. Despite showing promise in various medical fields, the feasibility and safety of using GPT models for clinical decision support (CDS) in prostate cancer remain unverified. This study aims to evaluate the feasibility of GPT models in providing CDS for patients suspected of prostate cancer by comparing the recommendations generated by GPT models with those provided by real-world urologists.

Methods: Patient data were collected from March 2022 to December 2023 from Tianjin Medical University Cancer Institute and Hospital and Tianjin Medical University Second Hospital. A total of 113 cases with comprehensive clinical and imaging data were selected. Clinical recommendations were generated by GPT models (ChatGPT and GPT-3.5) and compared with those provided by a non-oncology specialized urologist. The recommendations were evaluated by three prostate cancer experts for coherence, factual consistency, comprehensiveness, and potential medical harm using a 5-point Likert scale. Mann-Whitney U tests were employed to determine significant differences.

Results: The GPT models demonstrated high factual consistency (98.1% in high consistency group) and coherence in generating clinical recommendations. In terms of medical harm, no significant difference was observed overall between GPT models and the non-oncology urologist (p ≥ 0.05). However, in cases rated neutral (score = 3), the non-oncology urologist showed higher rates of ambiguous recommendations (10.5%) compared to GPT models (2.8%, p < 0.05). The GPT models' response time was significantly faster, averaging 5–15 seconds per case versus approximately 1 minute for the urologist.

Conclusion: GPT models show promise in providing clinical decision support for patients suspected of prostate cancer, with high factual consistency and efficient response times. However, challenges such as comprehensiveness and potential medical harm need to be addressed before widespread clinical application. Further research is warranted to validate the empowering effect of GPT models on non-specialist clinicians in clinical decision-making.
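
The abstract states that Mann-Whitney U tests were used to compare expert ratings given on a 5-point Likert scale. The following is a minimal sketch of how such a comparison could be computed, not the authors' analysis code. It assumes a two-sided test with the normal approximation and tie correction (no continuity correction); the function name `mann_whitney_u` and the example ratings are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    Illustrative helper only; not taken from the preprint.
    """
    x, y = list(x), list(y)
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)

    # Assign midranks: tied values share the average of their rank positions.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # U for x is its rank sum minus the smallest possible rank sum.
    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Mean and tie-corrected variance of U under the null hypothesis.
    n = n1 + n2
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation is identical
        return u_x, 1.0

    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Hypothetical Likert ratings (1-5), for illustration only.
gpt_ratings = [5, 4, 5, 4, 4, 5, 3, 4]
urologist_ratings = [4, 4, 3, 5, 4, 3, 4, 4]
u_stat, p = mann_whitney_u(gpt_ratings, urologist_ratings)
print(f"U = {u_stat}, p = {p:.3f}")
```

Likert ratings are ordinal and heavily tied, which is why a rank-based test with a tie correction is a natural fit here; for small samples an exact method would be preferable to the normal approximation used in this sketch.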

Version published to 10.21203/rs.3.rs-4885411/v1 on Research Square
Oct 18, 2024
