Detection of Intracranial Hemorrhage from Computed Tomography Images: Diagnostic Role and Efficacy of ChatGPT-4o

Mustafa Koyun
Zeycan Kubra Cevval
Bahadir Reis
Bunyamin Ece

Abstract

Background/Objectives: The role of artificial intelligence (AI) in radiological image analysis is rapidly evolving. This study evaluates the diagnostic performance of Chat Generative Pre-trained Transformer Omni (GPT-4 Omni) in detecting intracranial hemorrhages (ICHs) in non-contrast computed tomography (NCCT) images, along with its ability to classify hemorrhage type, stage, anatomical location, and associated findings.

Methods: A retrospective study was conducted using 240 cases, comprising 120 ICH cases and 120 controls with normal findings. Five consecutive NCCT slices per case were selected by radiologists and analyzed by ChatGPT-4o using a standardized prompt with nine questions. Diagnostic accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated by comparing the model’s results with radiologists’ assessments (the gold standard). After a two-week interval, the same dataset was re-evaluated to assess intra-observer reliability and consistency.

Results: ChatGPT-4o achieved 100% accuracy in identifying imaging modality type. For ICH detection, the model demonstrated a diagnostic accuracy of 68.3%, sensitivity of 79.2%, specificity of 57.5%, PPV of 65.1%, and NPV of 73.4%. It correctly classified 34.0% of hemorrhage types and 7.3% of localizations. All ICH-positive cases were identified as acute phase (100%). In the second evaluation, diagnostic accuracy improved to 73.3%, with a sensitivity of 86.7% and a specificity of 60%. The Cohen’s Kappa coefficient for intra-observer agreement in ICH detection indicated moderate agreement (κ = 0.469).

Conclusions: ChatGPT-4o shows promise in identifying imaging modalities and ICH presence but demonstrates limitations in localization and hemorrhage type classification. These findings highlight its potential for improvement through targeted training for medical applications.
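
The ICH detection metrics above all derive from a 2x2 confusion matrix against the radiologists' reference standard. The sketch below shows that arithmetic. The abstract reports only percentages, so the cell counts used here are an assumption: they are back-calculated from the stated sensitivity and specificity with 120 ICH cases and 120 controls, and are not figures given by the authors. The names `ConfusionCounts` and `diagnostic_metrics` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: model output vs. reference standard."""
    tp: int  # ICH present, model says ICH
    fn: int  # ICH present, model says no ICH
    tn: int  # no ICH, model says no ICH
    fp: int  # no ICH, model says ICH


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return accuracy, sensitivity, specificity, PPV and NPV as fractions."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# ASSUMPTION: counts inferred from the reported percentages, not from the paper.
# First evaluation: 79.2% of 120 ICH cases ~ 95, 57.5% of 120 controls = 69.
first_round = ConfusionCounts(tp=95, fn=25, tn=69, fp=51)
# Second evaluation: 86.7% of 120 ~ 104, 60% of 120 = 72.
second_round = ConfusionCounts(tp=104, fn=16, tn=72, fp=48)

if __name__ == "__main__":
    for label, counts in (("first", first_round), ("second", second_round)):
        print(f"{label} evaluation")
        for name, value in diagnostic_metrics(counts).items():
            print(f"  {name}: {value:.1%}")
```

Under these inferred counts, the first-round values round to the accuracy, sensitivity, specificity, PPV, and NPV reported in the abstract, and the second-round values round to the reported accuracy, sensitivity, and specificity. Cohen's kappa is not reproduced here because it depends on the case-by-case agreement between the two evaluations, which the abstract does not provide.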

Version published to 10.3390/diagnostics15020143
Jan 9, 2025
Version published to 10.20944/preprints202412.1475.v1
Dec 18, 2024

