Deep Learning of Suboptimal Spirometry to Predict Respiratory Outcomes and Mortality
Abstract
Importance: Obtaining spirometry requires repeated testing and reporting the maximal values that meet quality control criteria. Whether suboptimal efforts are useful for predicting respiratory outcomes is unclear.
Objective: To determine whether a machine learning model can predict respiratory outcomes and mortality from suboptimal spirometry.
Design: Observational cohorts (UK Biobank and COPDGene).
Setting: Multi-center; population-based and disease-enriched.
Participants: UK Biobank: aged 40-69. COPDGene (US): aged 45-80, >10 pack-years of smoking, without respiratory diseases other than COPD or asthma.
Exposures: Raw spirograms (volume-time).
Main outcomes and measures: To create a combined representation of lung function, we implemented a contrastive learning approach, the Spirogram-based Contrastive Learning Framework (Spiro-CLF), which used all recorded volume-time curves per participant and applied different transformations (e.g., flow-volume, flow-time). We defined “maximal” efforts as those passing quality control (QC) with the maximum FVC; all other efforts, including submaximal and QC-failing efforts, were defined as “suboptimal”. We trained the Spiro-CLF model on both maximal and suboptimal efforts from the UK Biobank. We tested the model in a held-out 20% UK Biobank test subset and in COPDGene on (1) binary prediction of FEV1/FVC <0.7 and FEV1 percent predicted (FEV1PP) <80%, (2) Cox regression for all-cause mortality, and (3) prediction of respiratory phenotypes.
Results: We trained Spiro-CLF on 940,705 volume-time curves from 352,684 UK Biobank (UKB) participants with 2-3 spirometry efforts per individual (66.7% with 3 efforts) and at least one QC-passing spirometry effort. Of all spirometry efforts, 61.6% were suboptimal (37.5% submaximal and 24.1% QC-failing).
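The effort definitions and the volume-time to flow-time transformation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Effort` container, the `label_efforts` and `flow_time` helpers, the approximation of FVC as the largest volume on the curve, the fixed sampling interval, and the toy curves are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Effort:
    """One spirometry effort (hypothetical container, not the study's data format)."""
    volume_l: List[float]  # exhaled volume in liters, sampled at a fixed interval
    passed_qc: bool        # outcome of an external quality-control check


def fvc(effort: Effort) -> float:
    """Approximate FVC as the largest exhaled volume on the curve (assumption)."""
    return max(effort.volume_l)


def label_efforts(efforts: List[Effort]) -> List[str]:
    """Label each effort as 'maximal', 'submaximal', or 'qc_failing'."""
    passing = [i for i, e in enumerate(efforts) if e.passed_qc]
    # The maximal effort is the QC-passing effort with the largest FVC.
    best = max(passing, key=lambda i: fvc(efforts[i])) if passing else None
    labels = []
    for i, e in enumerate(efforts):
        if not e.passed_qc:
            labels.append("qc_failing")
        elif i == best:
            labels.append("maximal")
        else:
            labels.append("submaximal")
    return labels


def flow_time(volume_l: List[float], dt_s: float) -> List[float]:
    """Derive a flow-time curve (L/s) from a volume-time curve by forward differences."""
    return [(b - a) / dt_s for a, b in zip(volume_l, volume_l[1:])]


# Toy curves invented for illustration; they are not study data.
efforts = [
    Effort([0.0, 1.5, 2.6, 3.0, 3.1], passed_qc=True),
    Effort([0.0, 1.8, 2.9, 3.3, 3.4], passed_qc=True),
    Effort([0.0, 0.9, 1.4, 1.5, 1.5], passed_qc=False),
]
labels = label_efforts(efforts)
# Everything that is not the maximal effort counts as suboptimal.
suboptimal = [lab != "maximal" for lab in labels]
print(labels)
print(flow_time(efforts[1].volume_l, dt_s=0.5))
```

In this sketch, a participant with no QC-passing effort would have no maximal effort at all; the study avoided that case by requiring at least one QC-passing effort per participant.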
In the UK Biobank, Spiro-CLF using QC-failing and submaximal efforts predicted FEV1/FVC <0.7 with an area under the receiver operating characteristic curve (AUROC) of 0.956, mortality with a concordance index of 0.647, and asthma with a 9-42% improvement versus baseline models. In COPDGene (n=10,110 participants), adding QC-passing, submaximal efforts did not improve the prediction of lung function or mortality; however, Spiro-CLF representations predicted asthma and respiratory phenotypes (joint test P ≤ 2 × 10⁻³).
Conclusions and Relevance: A machine learning model can predict respiratory phenotypes using suboptimal spirometry; results from all spirometry efforts may contain valuable data. Additional studies are required to determine performance and utility in specific clinical scenarios.
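For readers less familiar with the two reported metrics, the following Python sketch shows one common way to compute an AUROC and a concordance index from predicted scores. It is an assumption-laden illustration: the abstract does not state which estimators were used, so the pairwise AUROC and Harrell's C shown here are stand-ins, and the function names and toy inputs are hypothetical and unrelated to the study's results.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive case scores above a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count pairwise wins; tied scores contribute half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's C: share of comparable pairs in which the earlier death has higher risk."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy inputs invented for illustration; they are not study data.
obstruction = [1, 1, 0, 0, 0]             # 1 = FEV1/FVC < 0.7
predicted = [0.9, 0.4, 0.5, 0.2, 0.1]     # model score for obstruction
auc = auroc(obstruction, predicted)

follow_up_years = [2, 5, 7, 9]
died = [1, 1, 0, 1]                       # 0 = censored
risk_score = [0.8, 0.3, 0.5, 0.1]
c_index = concordance_index(follow_up_years, died, risk_score)
print(round(auc, 3), round(c_index, 3))
```

Both metrics equal 0.5 for uninformative scores and 1.0 for perfect ranking, which gives context for the reported AUROC of 0.956 and concordance index of 0.647.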