Developing A Data Pipeline to Quantify Ventilator Waveforms

Abstract

Objective

To automatically collect ventilator waveforms and integrate them with curated electronic health record data from thousands of patients to provide the data necessary to analyze the complex interactions between lung injury, patient effort, ventilator dyssynchrony, and ventilator mechanics. Such datasets do not currently exist, hampering the understanding of ventilator trajectories.

Design

A prospective, observational study in which a multidisciplinary team of data scientists, biomedical engineers, and clinicians developed an automated pipeline that collects high-fidelity ventilator waveform data and integrates these data with electronic health record data, including vital signs, sedation and agitation scores, lab results, and medications (drug, dose, and route). Importantly, electronic health record data are collected over a patient’s entire hospital course, allowing for a complete description of patient trajectories.
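
One way to picture the integration step is an "as-of" join: each breath is paired with the most recent electronic health record entry charted at or before the breath's timestamp. The sketch below is illustrative only. The abstract does not describe how the linkage is implemented, and the names `link_breaths_to_ehr`, `time`, and `sedation_score` are hypothetical.

```python
from bisect import bisect_right


def link_breaths_to_ehr(breath_times, ehr_records):
    """Pair each breath time with the latest EHR record at or before it.

    breath_times: breath start times in seconds (any consistent clock).
    ehr_records:  dicts with a hypothetical "time" key on the same clock.
    Returns a list of (breath_time, record_or_None) tuples.
    """
    # Sort once so that every lookup is a binary search.
    ordered = sorted(ehr_records, key=lambda r: r["time"])
    times = [r["time"] for r in ordered]

    linked = []
    for t in breath_times:
        # Index of the first record strictly after t; the one before it
        # is the most recent record charted at or before the breath.
        i = bisect_right(times, t)
        linked.append((t, ordered[i - 1] if i > 0 else None))
    return linked


# Hypothetical example: two charted sedation scores and four breaths.
ehr = [
    {"time": 100, "sedation_score": -4},
    {"time": 10, "sedation_score": -2},
]
linked = link_breaths_to_ehr([5, 10, 50, 150], ehr)
```

A breath that precedes every charted record is linked to `None` rather than to a later record, which avoids leaking future information into earlier breaths.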

Settings

The Medical Intensive Care Unit at the University of Colorado.

Patients

All mechanically ventilated adult patients.

Interventions

None.

Measurements

Automated collection of high-fidelity ventilator waveforms and electronic health record data.

Results

Between July 2023 and May 2025, we collected data from 1,116 patients, 968 (87%) of whom had >12 hours of mechanical ventilation. These patients generated 4,767 ventilator days (>13 ventilator years) of analyzable ventilator waveforms and had a median duration of ventilation of 2.6 [1.25, 6.06] days. Over 146 million breaths were segmented from the waveforms, of which 49 million were fit accurately by the linear single-compartment model, with a median compliance of 35.7 [25.2, 45.3] mL/H2O. Electronic health record data were linked to the waveforms, providing 8,511 [3,835, 17,040] records per patient. These data constitute a comprehensive database for studying the effects of mechanical ventilation, patient effort, ventilator dyssynchrony, and key non-ventilator covariates, such as sedation, across a large and heterogeneous cohort of patients.
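
For readers unfamiliar with the linear single-compartment model, it is commonly written as P(t) = V(t)/C + R·Q(t) + P0, where P is airway pressure, V is volume above end-expiratory volume, Q is flow, C is compliance, R is resistance, and P0 is the end-expiratory pressure. The sketch below fits that equation to one breath by ordinary least squares. It is a minimal illustration under assumptions, not the authors' method: the abstract does not state the fitting procedure or the accuracy criterion, so the function names, the units (mL, L/s, cmH2O), and the use of R² as a goodness-of-fit measure are all hypothetical.

```python
import math


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: breath cannot be fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_single_compartment(pressure, flow, volume):
    """Least-squares fit of P = E*V + R*Q + P0 for one breath.

    Assumed units: pressure in cmH2O, flow in L/s, volume in mL.
    Returns compliance (mL/cmH2O), resistance (cmH2O per L/s), p0, and R^2.
    """
    rows = [(v, q, 1.0) for v, q in zip(volume, flow)]
    # Normal equations (A^T A) x = A^T p for the three unknowns.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atp = [sum(r[i] * p for r, p in zip(rows, pressure)) for i in range(3)]
    elastance, resistance, p0 = solve_3x3(ata, atp)

    predicted = [elastance * v + resistance * q + p0 for v, q, _ in rows]
    mean_p = sum(pressure) / len(pressure)
    ss_res = sum((p - ph) ** 2 for p, ph in zip(pressure, predicted))
    ss_tot = sum((p - mean_p) ** 2 for p in pressure)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    compliance = 1.0 / elastance if elastance > 0 else float("nan")
    return {"compliance": compliance, "resistance": resistance, "p0": p0, "r2": r2}


def synthetic_breath(compliance=40.0, resistance=10.0, p0=5.0, dt=0.02):
    """Hypothetical noise-free breath: constant-flow inspiration, decaying expiration."""
    flow = [0.5] * 50 + [-0.8 * math.exp(-(k * dt) / 0.4) for k in range(100)]
    volume, v = [], 0.0
    for q in flow:
        v += q * dt * 1000.0  # integrate flow (L/s) into volume (mL)
        volume.append(v)
    pressure = [vv / compliance + resistance * q + p0 for vv, q in zip(volume, flow)]
    return pressure, flow, volume


fit = fit_single_compartment(*synthetic_breath())
```

On the noise-free synthetic breath the fit recovers the parameters used to generate it. Real breaths with patient effort or dyssynchrony deviate from the model, which is one plausible reason a breath would not be fit accurately.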

Conclusion

We created a fully automated data pipeline that continuously collects mechanical ventilation waveform data and integrates them with detailed electronic health record data, generating a unique, high-fidelity dataset that will be crucial for understanding the complex relationships among lung injury, patient effort, sedation, ventilator dyssynchrony, and ventilator mechanics.

Key Points

Question

Can an automated data pipeline collect continuous ventilator waveform data and integrate them with electronic health record data?

Findings

Between July 2023 and May 2025, we automatically collected data from 1,116 patients, 968 (87%) of whom received mechanical ventilation for more than 12 hours.

Meaning

This fully automated data collection pipeline will facilitate advances in understanding the complex relationships between lung injury, patient effort, sedation, ventilator dyssynchrony, and ventilator mechanics.
