Deep Learning of Suboptimal Spirometry to Predict Respiratory Outcomes and Mortality

Michael Cho
Davin Hill
Max Torop
Aria Masoomi
Peter Castaldi
Edwin Silverman
Sandeep Bodduluri
Surya Bhatt
Taedong Yun
Cory McLean
Farhad Hormozdiari
Jennifer Dy
Brian Hobbs

Abstract

Importance: Obtaining spirometry requires repeated testing, with the maximal values selected according to quality control criteria. Whether the suboptimal efforts are useful for predicting respiratory outcomes is not clear.

Objective: To determine whether a machine learning model could predict respiratory outcomes and mortality based on suboptimal spirometry.

Design: Observational cohorts (UK Biobank and COPDGene).

Setting: Multi-center; population-based and disease-enriched cohorts.

Participants: UK Biobank: aged 40-69. COPDGene (US): aged 45-80, >10 pack-years of smoking, without respiratory diseases other than COPD or asthma.

Exposures: Raw spirograms (volume-time).

Main outcomes and measures: To create a combined representation of lung function, we implemented a contrastive learning approach, the Spirogram-based Contrastive Learning Framework (Spiro-CLF), which used all recorded volume-time curves per participant and applied different transformations (e.g., flow-volume, flow-time). We defined "maximal" efforts as those passing quality control (QC) with the maximum FVC; all other efforts, including submaximal and QC-failing efforts, were defined as "suboptimal". We trained the Spiro-CLF model using both maximal and suboptimal efforts from the UK Biobank. We tested the model in a held-out 20% UK Biobank test subset and in COPDGene on 1) binary prediction of FEV1/FVC < 0.7 and FEV1 percent predicted (FEV1PP) < 80%, 2) Cox regression for all-cause mortality, and 3) prediction of respiratory phenotypes.

Results: We trained Spiro-CLF on 940,705 volume-time curves from 352,684 UK Biobank participants with 2-3 spirometry efforts per individual (66.7% with 3 efforts) and at least one QC-passing spirometry effort. Of all spirometry efforts, 61.6% were suboptimal (37.5% submaximal and 24.1% QC-failing). In the UK Biobank, Spiro-CLF using QC-failing and submaximal efforts predicted FEV1/FVC < 0.7 with an area under the receiver operating characteristic curve (AUROC) of 0.956, mortality with a concordance index of 0.647, and asthma with a 9-42% improvement versus baseline models. In COPDGene (n=10,110 participants), adding QC-passing, submaximal efforts did not improve the prediction of lung function or mortality; however, Spiro-CLF representations predicted asthma and respiratory phenotypes (joint test P ≤ 2 × 10⁻³).

Conclusions and Relevance: A machine learning model can predict respiratory phenotypes using suboptimal spirometry; results from all spirometry efforts may contain valuable data. Additional studies are required to determine performance and utility in specific clinical scenarios.
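The effort definitions and curve transformations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Effort` container, the function names, the fixed sampling interval `dt`, and the shortcut of taking FVC as the largest recorded volume are all assumptions made for illustration, and the abstract does not describe the actual preprocessing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Effort:
    """One spirometry effort (hypothetical container, not from the paper)."""
    volume: List[float]  # cumulative exhaled volume in litres, one value per sample
    passed_qc: bool      # whether the effort met quality-control criteria

    @property
    def fvc(self) -> float:
        # Illustrative simplification: treat the largest recorded volume as FVC.
        return max(self.volume)


def label_efforts(efforts: List[Effort]) -> List[str]:
    """Label each effort 'maximal', 'submaximal', or 'qc_failing'.

    'maximal' is the QC-passing effort with the largest FVC; every other
    effort counts as suboptimal (either submaximal or QC-failing).
    """
    passing = [i for i, e in enumerate(efforts) if e.passed_qc]
    best: Optional[int] = (
        max(passing, key=lambda i: efforts[i].fvc) if passing else None
    )
    labels = []
    for i, effort in enumerate(efforts):
        if not effort.passed_qc:
            labels.append("qc_failing")
        elif i == best:
            labels.append("maximal")
        else:
            labels.append("submaximal")
    return labels


def flow_time(volume: List[float], dt: float) -> List[float]:
    """Approximate flow (L/s) as the forward difference of volume over time."""
    return [(b - a) / dt for a, b in zip(volume, volume[1:])]


def flow_volume(volume: List[float], dt: float) -> List[Tuple[float, float]]:
    """Pair each flow sample with the volume at the start of its interval."""
    return list(zip(volume[:-1], flow_time(volume, dt)))


efforts = [
    Effort([0.0, 1.0, 2.0, 3.0], passed_qc=True),
    Effort([0.0, 1.5, 3.0, 4.0], passed_qc=True),
    Effort([0.0, 0.5, 1.0], passed_qc=False),
]
labels = label_efforts(efforts)
flows = flow_time(efforts[0].volume, dt=0.5)
```

Here the second effort is labeled maximal and the other two are suboptimal, which mirrors the split the abstract describes between the effort normally reported and the efforts normally discarded.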

Version published to 10.21203/rs.3.rs-6296752/v1 on Research Square
Jun 30, 2025

Related articles

Prediction of Chronic Obstructive Pulmonary Disease Using Machine Learning Models

This article has 7 authors:
1. Sher Ali
2. Omair Faqah
3. Elise Neubarth
4. Mohammad Shehroz Ashraf
5. Michael DeGiorgio
6. Mark Block
7. Waseem Asghar
This article has no evaluations
Latest version Dec 15, 2025

Responsible AI for Sepsis Prediction: Bridging the Gap Between Machine Learning Performance and Clinical Trust

This article has 6 authors:
1. Thiago Q. Oliveira
2. Leandro A. Carvalho
3. Flávio R. C. Sousa
4. João B. F. Filho
5. Khalil F. Oliveira
6. Daniel A. B. Tavares
This article has no evaluations
Latest version Jan 30, 2026

Voice as a Digital Biomarker: Foundation Model-Based COPD Assessment

This article has 9 authors:
1. Sang Mee Lee
2. Hyein Ryu
3. Sunga Kong
4. Sun Hye Shin
5. Wooseong Huh
6. Myung Jin Chung
7. Juhee Cho
8. Taeyoung Kim
9. Hye Yun Park
This article has no evaluations
Latest version Dec 18, 2025