Automated and interoperable methods for generalizable development of clinical machine-learning models for predicting neuromorbidity in critically ill children
Abstract
Objectives
To streamline the development of clinical machine learning (ML) models for predicting acute neurological morbidity in critically ill children by extending our prior work to create a standardized, reproducible, and scalable workflow leveraging Fast Healthcare Interoperability Resources (FHIR), cloud infrastructure, and automated ML tools.
Methods
We developed a workflow for extracting, cleaning, and modeling pediatric intensive care unit (PICU) data, using 168 biomarkers from 7,403 encounters at an academic children’s hospital between 2020 and 2024. Data were processed and stored in a compliant, secure cloud environment. To predict acquired neurological morbidity, we evaluated four feature sets: a baseline set from prior work, a complete set, a filtered set, and a set selected by a light gradient boosting machine (LightGBM). Automated ML was used to train, validate, and deploy models, with performance assessed using the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), F1 score, calibration metrics, and Shapley additive explanations (SHAP) values. A FHIR-based version of the pipeline was also implemented and evaluated on a 2020 subset of the cohort.
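As an illustration of what a FHIR-based extraction step can look like, the sketch below flattens a single FHIR R4 Observation resource into a tabular row suitable for feature engineering. It is not the authors' pipeline: the function name `flatten_observation`, the output column names, and the example resource are all hypothetical, and only numeric observations carried in `valueQuantity` are handled.

```python
from typing import Any, Optional


def flatten_observation(obs: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Turn one FHIR R4 Observation into a flat row, or None if unusable.

    Hypothetical helper for illustration; column names are assumptions.
    """
    if obs.get("resourceType") != "Observation":
        return None
    quantity = obs.get("valueQuantity")
    if not quantity or "value" not in quantity:
        return None  # this sketch skips non-numeric observations
    codings = obs.get("code", {}).get("coding", [])
    first = codings[0] if codings else {}
    return {
        "encounter": obs.get("encounter", {}).get("reference"),
        "code": first.get("code"),
        "name": first.get("display"),
        "value": float(quantity["value"]),
        "unit": quantity.get("unit"),
        "time": obs.get("effectiveDateTime"),
    }


# Made-up example resource (heart rate, LOINC 8867-4); not study data.
example = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org",
                         "code": "8867-4", "display": "Heart rate"}]},
    "encounter": {"reference": "Encounter/example-1"},
    "effectiveDateTime": "2020-03-01T08:15:00Z",
    "valueQuantity": {"value": 112, "unit": "beats/minute"},
}

row = flatten_observation(example)
print(row)
```

Keeping the encounter reference and timestamp on every row is what allows measurements to be grouped per encounter and ordered in time before modeling.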
Results
Filtered and LightGBM-based feature sets achieved the highest predictive performance, with AUROCs of 0.90 (95% CI: [0.88-0.92]) for both, and AUPRCs of 0.68 (95% CI: [0.63-0.73]) and 0.67 (95% CI: [0.62-0.72]), respectively. SHAP value analysis revealed consistent top features across models, with vital signs and key laboratory values prominently ranked. Models trained using FHIR-formatted data from a 2020 cohort (n = 1,339) demonstrated comparable performance to those built on the complete dataset, with an AUROC of 0.87 (95% CI: [0.81-0.93]).
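The abstract does not state how the 95% confidence intervals were obtained. One common choice is a percentile bootstrap over encounters, sketched below under that assumption. The function names, the number of resamples, and the toy labels and scores are illustrative only and are not taken from the study.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as 0.5. Pairwise O(n^2); fine for a sketch."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method)."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with one class has no defined AUROC
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Toy data, made up for illustration.
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.50, 0.90]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUROC {point:.3f} (95% CI: [{lo:.2f}-{hi:.2f}])")
```

With a smaller cohort, fewer encounters per resample lead to a wider interval, which is consistent with the wider interval reported for the n = 1,339 FHIR subset than for the full dataset.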
Conclusions
This study demonstrates the feasibility of a cloud-compatible, standards-based approach to clinical ML model development. By leveraging interoperable data formats and automated modeling workflows, this approach supports scalable, reproducible model construction and evaluation, enabling improved efficiency and transparency in clinical decision support.