A hybrid computer vision model to predict lung cancer in diverse populations
Abstract
Importance
Disparities in lung cancer incidence affect Black populations, which have the highest rates of lung cancer in the US. Current age- and tobacco-based screening criteria underserve Black populations due to disparately elevated rates of lung cancer in the screening-eligible population. Individualized risk-based screening is a potential means to mitigate these disparities.
Objective
To evaluate prediction models that integrate clinical and imaging-based features to individualize lung cancer risk as an alternative to existing screening criteria associated with disparities.
Design
This cross-sectional study used participants from the National Lung Screening Trial (NLST) and a population-based cohort from the University of Illinois Health system (UIH) comprising individuals at risk of lung cancer with available lung CT imaging and follow-up between 2015 and 2024.
Setting
Multicenter (NLST) and population-based (UIH, urban and suburban Cook County).
Participants
A total of 53,452 participants in NLST and 11,654 in UIH were included based on age- and tobacco use-based risk factors for lung cancer. Both cohorts were used for training and testing deep learning and machine learning models that used clinical features alone or in combination with CT image features (hybrid computer vision).
Exposure
Continuous model risk predictions, ranging from 0 to 100, for multiple years after CT imaging, with higher values indicating a higher likelihood of lung cancer.
Main Outcomes and Measures
Model accuracy, measured as the area under the receiver operating characteristic curve (ROC-AUC), across racial and other demographic groups.
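As an illustration of this outcome measure, the following minimal sketch computes ROC-AUC separately for each demographic group, using the pairwise definition of AUC (the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half). It is not the study's code: the function names, the group labels "A" and "B", and the toy records are all hypothetical and chosen only for demonstration.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """Pairwise ROC-AUC: share of (case, non-case) pairs ranked correctly.

    Ties contribute 0.5. Returns None when one class is absent,
    because the AUC is undefined in that situation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def roc_auc_by_group(records):
    """Compute ROC-AUC per group from (group, label, score) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in grouped.items()}


# Hypothetical toy records: (group, lung cancer label, risk score 0-100).
toy = [
    ("A", 1, 80), ("A", 0, 20), ("A", 1, 60), ("A", 0, 40),
    ("B", 1, 55), ("B", 0, 60), ("B", 1, 70), ("B", 0, 30),
]
result = roc_auc_by_group(toy)
print(result)
```

Reporting the metric per group rather than pooled is what makes a subgroup gap visible; a pooled AUC can look acceptable while one group performs markedly worse.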
Results
An optimized model using 7 clinical features achieved ROC-AUC values ranging from 0.64 to 0.67 in NLST and from 0.60 to 0.65 in UIH across multiple years. Incorporating imaging features to form a hybrid computer vision model significantly improved ROC-AUC values to 0.78-0.91 in NLST, but performance deteriorated in UIH (ROC-AUC 0.68-0.80), a decline attributable to Black participants, for whom ROC-AUC values ranged from 0.63 to 0.72 across multiple years. Retraining the hybrid computer vision model with Black and other participants from the UIH cohort improved performance, with ROC-AUC values of 0.70-0.87 in a held-out UIH test set.
Conclusions and Relevance
Hybrid computer vision produced individualized lung cancer risk predictions with improved accuracy compared with clinical risk models alone. However, potential biases in the image training data reduced model generalizability in Black participants. Performance improved upon retraining with a subset of the UIH cohort, suggesting that inclusive training and validation datasets can minimize racial disparities, thereby improving health equity in clinical use.
KEY POINTS
Question
Can artificial intelligence (AI) tools for cancer detection that use clinical information and low-dose screening lung CT equitably estimate the development of lung cancer in diverse populations to overcome racial disparities in lung cancer screening?
Findings
In this study of 53,452 participants from the National Lung Screening Trial and 11,654 participants from the University of Illinois Health (UIH) cohort, a hybrid computer vision model predicted development of lung cancer with high accuracy, with an area under the receiver operating characteristic curve (ROC-AUC) of 0.78-0.91 for years 1-6 after low-dose CT. A disparity in performance was observed in Black, but not White, participants in the UIH cohort, with ROC-AUC ranging from 0.63 to 0.72. Retraining the computer vision component of the model with a greater number of Black participants improved ROC-AUC performance to 0.70-0.87.
Meaning
This study suggests that AI tools can be optimized to equitably estimate future lung cancer risk and to identify high-risk individuals who may benefit from preventive measures, including supplemental screening.