Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Joowhan Sung
Peter James Kitonsa
Annet Nalutaaya
David Isooba
Susan Birabwa
Keneth Ndyabayunga
Rogers Okura
Jonathan Magezi
Deborah Nantale
Ivan Mugabi
Violet Nakiiza
David W Dowdy
Achilles Katamba
Emily A Kendall

Abstract

Background

Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening in general populations are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics.

Methods

We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores at or above a pre-specified threshold of 0.1 (range, 0-1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches.

Findings

Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.928 (95% confidence interval [CI] 0.910-0.943) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 95.4% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.54) versus 77.1% for thresholds stratified by age and sex (p=0.032).
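
The proportions above, and the size of the group covered by the 0.1% assumption, can be checked with a few lines of arithmetic. The sketch below only reproduces counts stated in this paragraph; it is not the authors' estimation procedure, and the variable names are illustrative. It does not address how the 8,949 - 7,219 = 1,730 participants who scored ≥0.1 but lack a valid Xpert Ultra result were handled, which this abstract does not describe.

```python
# Counts reported in the Findings paragraph.
screened = 52_835
score_at_or_above_cutoff = 8_949   # X-ray score >= 0.1
valid_xpert = 7_219
xpert_positive = 382

# Assumption stated in the abstract: 0.1% of people scoring < 0.1
# would have been Xpert-positive had they been tested.
assumed_prevalence_below_cutoff = 0.001

below_cutoff = screened - score_at_or_above_cutoff
share_at_or_above = score_at_or_above_cutoff / screened
positivity_among_tested = xpert_positive / valid_xpert
imputed_positives_below = below_cutoff * assumed_prevalence_below_cutoff

print(f"Scored >= 0.1: {share_at_or_above:.1%}")
print(f"Xpert-positive among valid results: {positivity_among_tested:.1%}")
print(f"Participants scoring < 0.1: {below_cutoff}")
print(f"Imputed Xpert-positives below cutoff: {imputed_positives_below:.0f}")
```

Under the stated assumption, roughly 44 of the 43,886 untested participants with scores <0.1 would be counted as Xpert-positive, which is small relative to the 382 confirmed cases but not negligible for sensitivity estimates.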

Interpretation

The accuracy of CAD for TB screening in general populations, irrespective of symptoms, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening.

Funding

National Institutes of Health

Research in context

Evidence before this study

The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and populations and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published up to September 9, 2024, using the search terms “tuberculosis” AND (“computer-aided detection” OR “computer aided detection” OR “CAD” OR “computer-aided reading” OR “computer aided reading” OR “artificial intelligence”), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUCs) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained by adjusting CAD thresholds between individuals based on personal characteristics.

Added value of this study

In this study, all consenting individuals in the general population of a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.928, and we found that the qXR v3 threshold needed to decrease to near 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and/or sex strata resulted in a 1% to 2% increase in sensitivity without affecting specificity.
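
The comparison between a universal threshold and stratum-specific thresholds can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Person` record, the `sens_spec` helper, the stratum labels, the threshold values, and the eight-plus-four toy records are invented and are not study data, and the code does not reproduce how the authors chose their stratified thresholds. It only shows the mechanics of holding specificity fixed while sensitivity changes.

```python
from dataclasses import dataclass


@dataclass
class Person:
    score: float   # CAD abnormality score on a 0-1 scale
    stratum: str   # hypothetical stratum label
    has_tb: bool   # reference-standard result


def sens_spec(people, thresholds):
    """Return (sensitivity, specificity) when each person is flagged
    if their score is >= the threshold assigned to their stratum."""
    tp = fn = tn = fp = 0
    for p in people:
        flagged = p.score >= thresholds[p.stratum]
        if p.has_tb and flagged:
            tp += 1
        elif p.has_tb:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: older people without TB tend to score higher here.
people = (
    [Person(s, "younger", True) for s in (0.45, 0.70)]
    + [Person(s, "younger", False) for s in (0.05, 0.10, 0.20, 0.30)]
    + [Person(s, "older", True) for s in (0.80, 0.90)]
    + [Person(s, "older", False) for s in (0.20, 0.55, 0.60, 0.65)]
)

universal = {"younger": 0.50, "older": 0.50}
stratified = {"younger": 0.15, "older": 0.62}

print(sens_spec(people, universal))    # (0.75, 0.625)
print(sens_spec(people, stratified))   # (1.0, 0.625)
```

In this toy set both schemes produce three false positives, so specificity is unchanged, while the lower threshold in the younger stratum picks up one additional true case. The size of the gain is an artifact of the invented data and says nothing about the magnitude observed in the study.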

Implications of all the available evidence

To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.

Version published to 10.1101/2025.04.09.25325458v1 on medRxiv
Apr 10, 2025

Related articles

Pulmonary tuberculosis prediction using CAD4TB artificial intelligence (computer-aided detection for tuberculosis) based on thoracic x-ray photos among Indonesian subjects in hospital

This article has 12 authors:
1. Erlina Burhan
2. Maryastuti
3. Listi Wulandari
4. Diah Handayani
5. Salsabila Rezkia Andini
6. Anandya Naufal Rahadhi
7. Gde Ngurah Irfan Bhaskara
8. Khansa Putrirana
9. Ariestiana Ayu Ananda Latifa
10. Ahmad Fadhil Ilham
11. Ihya Akbar
12. M Prasetio Wardoyo
This article has no evaluations. Latest version: Apr 22, 2025

Evaluation of an ultra-portable X-ray system with automated interpretation for tuberculosis active case finding in carceral settings: a diagnostic test accuracy study

This article has 14 authors:
1. Argita D. Salindri
2. José V. B. Bampi
3. Caroline Busatto
4. Alessandra M. da Silva
5. Andrea da Silva Santos
6. Isabella B. Gonçalves
7. Thais O. Gonçalves
8. Eunice A. T. Cunha
9. Daniel Tsuha
10. Everton Lemos
11. Roberto D. Oliveira
12. Mariana Croda
13. Jason R. Andrews
14. Julio Croda
This article has no evaluations. Latest version: Apr 7, 2025

Diagnostic Accuracy of an Abbreviated vs. a Full MRI Breast Protocol in Detecting Breast Cancer: An ROC Study

This article has 4 authors:
1. Francis Zarb
2. Deborah Mizzi
3. Paul Bezzina
4. Leanne Galea
This article has no evaluations. Latest version: May 6, 2025