What can reader studies of radiologist use of AI models teach us about adaptation? A signal detection modelling exploration

Lana Tikhomirov
Luke Smith
Alix Bird
Lyle Palmer
Nikhil Cherian Kurian
Carolyn Semmler

Abstract

A significant component of regulatory approval for diagnostic artificial intelligence (AI) models is reader study evidence: diagnostic performance studies of radiologists on a curated set of images. Retrospective multi-reader, multi-case (MRMC) reader studies of AI use are common in medical AI, though they tend to overestimate AI performance and underestimate human performance through the use of aggregated metrics. Further, there has been limited investigation into the impact of diagnostic AI models on radiological decision-making, with few if any cognitive modelling studies undertaken. This has serious implications, as the evidence base needed to implement these models lacks verified cognitive research. As such, this paper explores a lost opportunity to model diagnostic reader studies and asks: what can reader studies teach us about radiologist adaptation to AI models?

In this study, we used a 2021 reader study of a commercially available Australian diagnostic system by Harrison.ai, in which 20 expert radiologists read 1163 cases across 127 pathologies. We fit hierarchical signal detection models to these data at the individual and pathology level. Our modelling results indicated that, at both levels, radiologists had increased discriminability with AI, though with a more liberal response bias, leading to higher false positive rates. Further exploratory analysis based on these results indicated that disease coverage rates became homogenised with AI, and that different AI-generated segmentations may have influenced correct rejection rates. We provide real-world interpretations of these findings.
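For readers unfamiliar with the two signal detection quantities named in the abstract, the sketch below shows how discriminability (d') and response bias (criterion c) are conventionally computed from a single reader's decision counts under an equal-variance Gaussian model. This is not the hierarchical model the authors fit. It is a simple point-estimate illustration, the function name `sdt_indices` is hypothetical, the log-linear correction is one common choice among several, and the counts are invented for demonstration only.

```python
from statistics import NormalDist


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) under an equal-variance Gaussian SDT model."""
    # Log-linear correction (an assumption of this sketch): add 0.5 to each
    # cell so the rates never reach exactly 0 or 1, where the z-transform
    # would be undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # discriminability
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # negative = liberal bias
    return d_prime, criterion


# Invented counts for one reader on one pathology (not data from the study).
unaided = sdt_indices(hits=70, misses=30, false_alarms=10, correct_rejections=90)
aided = sdt_indices(hits=90, misses=10, false_alarms=20, correct_rejections=80)

print(f"unaided: d'={unaided[0]:.2f}, c={unaided[1]:.2f}")
print(f"aided:   d'={aided[0]:.2f}, c={aided[1]:.2f}")
```

With these made-up counts, the aided reader has a higher d' and a lower (more liberal) criterion, which is the qualitative pattern the abstract describes: better discrimination alongside more false positives.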

Version published to 10.31234/osf.io/zm94n_v3 on OSF Preprints
Feb 11, 2026
Version published to 10.31234/osf.io/zm94n on OSF Preprints
Oct 14, 2024

Related articles

How to Evaluate Medical AI

This article has 8 authors:
1. Ilia Kopanichuk
2. Petr Anokhin
3. Vladimir Shaposhnikov
4. Vladimir Makharev
5. Ekaterina Tsapieva
6. Iaroslav Bespalov
7. Dmitry Dylov
8. Ivan Oseledets
This article has no evaluations. Latest version: Jan 22, 2026

Segmenting with Confidence: Uncertainty Quantification for Brain Tumor Imaging

This article has 8 authors:
1. Yassine Guennoun
2. Pierre Nedelec
3. Mark McArthur
4. Evan Bloch
5. Jinchi Wei
6. Leo Sugrue
7. Evan Calabrese
8. Andreas Rauschecker
This article has no evaluations. Latest version: Jan 9, 2026

A scoping review of silent trials for medical artificial intelligence

This article has 25 authors:
1. Lana Tikhomirov
2. Carolyn Semmler
3. Noah Prizant
4. Srijan Bhasin
5. Georgia Kenyon
6. Anton Van Der Vegt
7. Lauren Erdman
8. Lyle Palmer
9. Abdullahi Mohamud
10. Judy Gichoya
11. Seyi Soremekun
12. Mark Sendak
13. James Anderson
14. Stephen Pfohl
15. Ian Stedman
16. Daniel Ehrmann
17. Karin Verspoor
18. Jethro Kwong
19. Lesley-Anne Farmer
20. Alex John London
21. Ismail Akrout
22. Shalmali Joshi
23. Elena Dicus
24. Xiaoxuan Liu
25. Melissa McCradden
This article has no evaluations. Latest version: Feb 5, 2026
