Automated and interoperable methods for generalizable development of clinical machine-learning models for predicting neuromorbidity in critically ill children

Ruoting Li
Christopher M. Horvat
Mehdi Nourelahi
Eddie Pérez Claudio
Jason Hammett
Mark S. Wainwright
Robert S.B. Clark
Alicia K. Au
Harry Hochheiser

Abstract

Objectives

To streamline the development of clinical machine learning (ML) models for predicting acute neurological morbidity in critically ill children by extending our prior work to create a standardized, reproducible, and scalable workflow leveraging Fast Healthcare Interoperability Resources (FHIR), cloud infrastructure, and automated ML tools.

Methods

We developed a workflow for extracting, cleaning, and modeling pediatric intensive care unit (PICU) data, using 168 biomarkers from 7,403 encounters at an academic children’s hospital between 2020 and 2024. Data were processed and stored in a compliant, secure cloud environment. To predict acquired neurological morbidity, we evaluated four feature sets: a baseline set from prior work, a complete set, a filtered set, and a set selected by a light gradient boosting machine (LightGBM). Automated ML was used to train, validate, and deploy models, with performance assessed using the area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), F1 score, calibration metrics, and SHapley Additive exPlanations (SHAP) values. A FHIR-based version of the pipeline was also implemented and evaluated on a 2020 subset of the cohort.
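As a rough illustration of three of the discrimination metrics named above, the following sketch computes AUROC, AUPRC (as step-wise average precision), and F1 from binary labels and predicted scores using only the Python standard library. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example, and the average-precision routine assumes scores are not tied.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve; assumes no tied scores."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    # Walk the cases from highest to lowest score, adding precision at each true positive.
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank
    return total / n_pos


def f1_at_threshold(y_true: Sequence[int], y_score: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score after labelling scores >= threshold as positive (threshold is an assumption)."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, invented for illustration only (not from the study cohort).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(round(auroc(labels, scores), 3))              # 0.75
print(round(average_precision(labels, scores), 3))  # 0.833
print(round(f1_at_threshold(labels, scores), 3))    # 0.667
```

With a low event rate, AUPRC tends to be more informative than AUROC because its chance-level baseline is the outcome prevalence rather than 0.5, which is one reason the two are commonly reported together.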

Results

Filtered and LightGBM-based feature sets achieved the highest predictive performance, with AUROCs of 0.90 (95% CI: [0.88-0.92]) for both, and AUPRCs of 0.68 (95% CI: [0.63-0.73]) and 0.67 (95% CI: [0.62-0.72]), respectively. SHAP value analysis revealed consistent top features across models, with vital signs and key laboratory values prominently ranked. Models trained using FHIR-formatted data from a 2020 cohort (n = 1,339) demonstrated comparable performance to those built on the complete dataset, with an AUROC of 0.87 (95% CI: [0.81-0.93]).
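The abstract does not state how the 95% confidence intervals were obtained. One common choice is a percentile bootstrap over held-out cases, sketched below under that assumption. The function names, the number of resamples, the seed, and the synthetic labels and scores are all hypothetical and are not taken from the study.

```python
import random
from typing import Callable, List, Sequence, Tuple


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true: Sequence[int], y_score: Sequence[float],
                 metric: Callable[[Sequence[int], Sequence[float]], float],
                 n_boot: int = 1000, alpha: float = 0.05,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for a metric, resampling cases with replacement."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample is missing a class, so the metric is undefined; redraw
        stats.append(metric(ys, [y_score[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic example: positives tend to score higher than negatives.
demo_rng = random.Random(42)
labels = [1] * 20 + [0] * 60
scores = [demo_rng.gauss(0.7, 0.2) for _ in range(20)] + \
         [demo_rng.gauss(0.4, 0.2) for _ in range(60)]
low, high = bootstrap_ci(labels, scores, auroc, n_boot=500)
print(f"AUROC {auroc(labels, scores):.2f} (95% CI: [{low:.2f}-{high:.2f}])")
```

The interval narrows as the number of cases grows, which is consistent with the wider interval reported for the smaller 2020 FHIR subset (n = 1,339) than for the full cohort.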

Conclusions

This study demonstrates the feasibility of a cloud-compatible, standards-based approach to clinical ML model development. By leveraging interoperable data formats and automated modeling workflows, this approach supports scalable, reproducible model construction and evaluation, enabling improved efficiency and transparency in clinical decision support.

Version published to 10.1101/2025.08.01.25332805 on medRxiv
Aug 5, 2025

Related articles

Development of a Machine Learning-Based Predictive Model and Clinically-Oriented Web Application for 30-Day Mortality Following Cardiac Surgery

This article has 6 authors:
1. Telmo Miguel-Medina
2. Susel Góngora Alonso
3. Isabel de la Torre Díez
4. Miriam Blanco Sáez
5. Mª Lourdes del Río Solá
6. Mohammed Amoon
This article has no evaluations
Latest version Dec 10, 2025

Responsible AI for Sepsis Prediction: Bridging the Gap Between Machine Learning Performance and Clinical Trust

This article has 6 authors:
1. Thiago Q. Oliveira
2. Leandro A. Carvalho
3. Flávio R. C. Sousa
4. João B. F. Filho
5. Khalil F. Oliveira
6. Daniel A. B. Tavares
This article has no evaluations
Latest version Jan 30, 2026

Benchmarking Ensemble Machine Learning Algorithms for the Early Prediction of Stroke in Imbalanced Clinical Cohorts: A Comparative Analysis and Decision Curve Assessment

This article has 2 authors:
1. Ibrahim Ibrahim Shuaibu
2. Yousaf Hussain
This article has no evaluations
Latest version Jan 22, 2026
