Multireader evaluation of radiologist performance for COVID-19 detection on emergency department chest radiographs
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
SciScore for 10.1101/2021.10.20.21265278: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Ethics (IRB): Our institution's Institutional Review Board reviewed and approved this retrospective study (STUDY00000506) and granted a waiver of HIPAA authorization and written informed consent.
- Consent: Our institution's Institutional Review Board reviewed and approved this retrospective study (STUDY00000506) and granted a waiver of HIPAA authorization and written informed consent.
- Sex as a biological variable: not detected.
- Randomization: 100 cases (50 RT-PCR positive and 50 RT-PCR negative) were randomly selected to be read by all 10 readers, and the remaining 1927 cases were randomly assigned to be read by two readers.
- Blinding: Readers were blinded to RT-PCR results and instructed to treat each image as a person under investigation (PUI) to reflect the common emergency department workflow, where patient information may be unknown and symptoms such as cough, dyspnea, shortness of breath, or fever may overlap.
- Power analysis: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: "Descriptive statistical analysis was performed using the Pandas and scikit-learn Python library. (11)"
- Sentence: "Data visualizations were generated using matplotlib and Seaborn in Python. (12,13)" — matplotlib suggested: (MatPlotLib, RRID:SCR_008624)
- Sentence: "Interreader agreement was calculated using the Fleiss κ score as implemented in the nltk.metrics.agreement Python module. (14)" — Python suggested: (IPython, RRID:SCR_001658)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
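The manuscript computed interreader agreement with the Fleiss κ implementation in `nltk.metrics.agreement`. As a self-contained illustration of the statistic itself (a minimal stdlib re-implementation, not the nltk code or the study's data), κ compares mean per-item rater agreement against the agreement expected from the marginal label frequencies:

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for `ratings`: a list of per-item label lists,
    each item rated by the same number of raters."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]       # per-item label counts
    categories = set().union(*counts)
    # mean observed per-item agreement P-bar
    p_bar = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_items
    # chance agreement P_e from marginal category proportions
    p_e = sum(
        (sum(c[cat] for c in counts) / (n_items * n_raters)) ** 2
        for cat in categories
    )
    return (p_bar - p_e) / (1 - p_e)

# Toy example (labels are hypothetical, not the study's reads)
labels = [
    ["covid", "covid", "covid"],
    ["covid", "covid", "normal"],
    ["covid", "normal", "normal"],
    ["normal", "normal", "normal"],
]
print(round(fleiss_kappa(labels), 3))  # 0.333
```

Values near 0 indicate chance-level agreement; the report's κ of 0.36 for RT-PCR positive cases sits in the commonly cited "fair to moderate" range.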
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:

"This limitation may still exist in underserved regions both domestically and internationally, or as new waves such as the recent surge due to the Delta variant occur and strain healthcare resources. Our study demonstrates three major findings: 1) there is low utility of CXRs in diagnosing patients who will be COVID-19 positive; 2) clinical history is not useful in improving radiologist performance for COVID-19 diagnosis; and 3) CXRs are more useful for excluding COVID-19 diagnosis, with a consistent level of performance across a diverse group of radiologists. There was a very low rate of assigning COVID-19 labels to RT-PCR negative patients, totaling 0.6% (6/950) when two radiologists agreed and 4.0% (38/950) when they disagreed. This observation persists even with the introduction of clinical history of chest pain, cough, infection, PUI, or shortness of breath. For RT-PCR positive patients, we find that radiologists have poor interrater agreement of labels (Fleiss score 0.36) and nonspecific performance for diagnosing COVID-19, with a wide distribution of the remaining labels and low sensitivity for detection. Two readers agreed on a COVID-19 diagnosis only 15.9% (110/688) of the time, and one of two readers labeled COVID-19 23.8% (164/688) of the time. This performance does not improve even when the clinical history is provided, and improves only slightly with >5 years of experience among the readers. We also did not note any increase in COVID-19 diagnostic performance..."
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.