Prospective Multicenter Validation of Machine Learning Models for Mortality Prediction in Adult Critically Ill Patients using Transfer Learning

Ioannis Papapanagiotou
Charikleia S. Vrettou
Maria Theodorakopoulou
Zafiria Mastora
Vassiliki Giannopoulou
Olga Kampouropoulou
Apostolos Karalis
Maria Pratikaki
Spyretta Golemati
Georgios Poupouzas
Vasileios Issaris
Kyriakos Karkoulias
Sofia Pouriki
Chrysi Keskinidou
Nikolaos S. Lotsios
Anastasia Kotanidou
Alice G. Vassiliou
Stavros Papapanagiotou
Ioanna Dimopoulou

Abstract

Mortality prediction in critically ill patients remains challenging due to poor cross-institutional performance and limited generalizability of machine learning models. This study addresses these limitations by systematically benchmarking and prospectively validating transfer learning frameworks. We trained our models on MIMIC-IV and validated them on a multicenter prospective cohort of 539 patients from three hospitals. We compared tree-based methods and modern deep learning architectures for tabular data. The results showed that both Domain Adaptation (DA) and Inductive Transfer Learning (ITL) significantly enhanced model performance under realistic conditions where target-domain data are limited. DA consistently improved discrimination across all evaluated models, with LightGBM showing the most significant gains in Area Under the Receiver Operating Characteristic Curve (AUC) (p = 0.0010) and XGBoost yielding the largest improvements in Area Under the Precision-Recall Curve (AUPRC) (p = 0.0419). Among all evaluated models, Random Forest (RF) achieved the highest discriminative performance, with an AUC of 90.7% under DA and an AUPRC of 81.3% under ITL. Notably, the domain-adapted models significantly outperformed APACHE II (p = 0.0044) and SOFA (p = 0.0077). These findings suggest that transfer learning provides a robust and data-efficient pathway for improving model generalizability across heterogeneous populations, offering a pragmatic solution to the challenge of model degradation in clinical deployment.
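
The two discrimination metrics reported in the abstract, AUC and AUPRC, can be illustrated with a minimal standard-library sketch. This is not the study's evaluation code, which is not shown here; the function names `roc_auc` and `average_precision` and the toy labels and scores are hypothetical, and AUPRC is approximated by average precision under the assumption that predicted scores are not tied.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive (death) is scored
    higher than a random negative (survivor); tied pairs count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, a step-wise estimate of the area under the
    precision-recall curve. Assumes no tied scores (illustrative only)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the rank where this positive is retrieved.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


# Hypothetical toy data: 1 = died, 0 = survived; scores are predicted risks.
labels = [1, 0, 1, 0, 0, 1]
risks = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, risks)
auprc = average_precision(labels, risks)
```

AUC is insensitive to class balance, whereas AUPRC depends on the event rate, which is one reason a mortality study may report both.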

Version published to 10.21203/rs.3.rs-8872055/v1 on Research Square
Feb 19, 2026

Related articles

Evaluating the Predictive Accuracy of Deep Learning Algorithms for Postoperative Mortality in Cardiac Surgery: A Systematic Review and Meta-Analysis

This article has 9 authors:
1. Ibrahim Ibrahim Shuaibu
2. Ahmad Yaseen Al Mahmoud
3. Ibrahem Aaroud
4. Abdalsalam Rizq Abazid
5. Mohamed Helmy Mohamed Abdelsalaam
6. Numaira Naeem Gazge
7. Mazen Mohammed Saad Alabed
8. Shahd Eltayeb
9. Sobhan Pahlavan Zadeh
This article has no evaluations. Latest version: Mar 31, 2026

Dynamic Landmark-Based Prediction of Sepsis Using Interpretable and Balanced Machine Learning Models in Respiratory-Supported Critically ill Patients

This article has 7 authors:
1. Ayao Sangenis Assogba
2. Jennifer H. Gladius
3. Komi Selassi Gayi
4. Samadou Tchakondo
5. Yendouname Kandjoni
6. Richard Sagacity Tugbeh
7. Rachana Das
This article has no evaluations. Latest version: Mar 25, 2026

Case-Control Matching Erodes Feature Discriminability for Machine Learning-Based Sepsis Prediction in ICUs: A Retrospective Cohort Study

This article has 6 authors:
1. Sophia Ehlers
2. Youssef Farag
3. Fanny Tranchellini
4. Tim Hahn
5. Catherine Jutzeler
6. Lakmal Meegahapola
This article has no evaluations. Latest version: Apr 9, 2026
