Development and Multicenter External Validation of a Data-Driven Scoring System for Early and Rapid Identification of Sepsis in Emergency Departments

Yanwei Jin
Yinzhao Wang
Xiaodong Huang
David A. Wacker
Michael A. Puskarich
Feng Xie

Abstract

Importance

Sepsis is a leading cause of morbidity and mortality worldwide. Timely recognition and treatment in the emergency department (ED), within what is often called the “golden window,” are critical to improving outcomes. Yet current diagnostic tools either demonstrate limited accuracy or rely on laboratory results that are not immediately available during the initial ED evaluation, which constrains rapid and reliable sepsis identification in the ED.

Objective

To develop and externally validate a data-driven, interpretable score for early identification of sepsis in the ED across three large health systems.

Design, Setting, and Participants

This retrospective cohort study used electronic health records from three health systems. The primary derivation cohort included all ED visits at 11 hospitals within the M Health Fairview system (Minnesota, 2019-2025). The two external cohorts comprised ED visits to Beth Israel Deaconess Medical Center (BIDMC, Boston, 2011-2019), extracted from the MIMIC-IV-ED database, and ED visits to Stanford Health Care (Stanford, 2020-2022), sourced from the MC-MED database. In our analysis, completed in August 2025, we developed the Emergency Sepsis Risk Prediction (ESRP) score using the AutoScore framework. We evaluated its performance against commonly used bedside tools, including the quick Sequential Organ Failure Assessment (qSOFA), the National Early Warning Score (NEWS), the Modified Early Warning Score (MEWS), and the Rapid Emergency Medicine Score (REMS), as well as logistic regression (LR) and random forest (RF) models.

Main Outcomes and Measures

The primary outcome was a sepsis diagnosis during the ED or hospital stay, determined from ICD-9 and ICD-10 discharge codes. Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), the area under the precision–recall curve (AUPRC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
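For readers less familiar with these measures, the following is a minimal sketch of how they can be computed from binary outcome labels and risk scores. It is not the authors' evaluation code; the function names, the toy data, and the cut-off are assumptions made for illustration. AUROC is computed here through its rank interpretation (the probability that a randomly chosen sepsis visit scores higher than a randomly chosen non-sepsis visit), and AUPRC is omitted for brevity.

```python
import math
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], cutoff: float
) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV when visits with score >= cutoff are flagged."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        flagged = score >= cutoff
        if label == 1 and flagged:
            tp += 1
        elif label == 1:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return math.nan
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [0, 0, 1, 1]
    risk_scores = [1, 3, 2, 4]
    print(threshold_metrics(labels, risk_scores, cutoff=3))
    print(auroc(labels, risk_scores))
```

The pairwise loop in `auroc` is quadratic in the number of visits, so it only suits small examples; for cohorts of the size reported here, a sort-based implementation from an established statistics library would be the practical choice.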

Results

The study included a total of 2,193,244 ED visits across three sites: 1,626,055 in the Minnesota cohort (1,271,865 for model derivation and 354,190 for internal validation), and 448,804 and 118,385 in the BIDMC and Stanford external validation cohorts, respectively. In Minnesota internal validation, the ESRP score achieved an AUROC of 0.820 (95% CI, 0.810–0.825) and an AUPRC of 0.054 (95% CI, 0.051–0.058). In the BIDMC cohort, the ESRP achieved an AUROC of 0.838 (95% CI, 0.833–0.842), compared with 0.636 (95% CI, 0.633–0.640) for qSOFA and 0.760 (95% CI, 0.757–0.765) for NEWS. In the Stanford cohort, the ESRP achieved an AUROC of 0.892 (95% CI, 0.887–0.898), compared with 0.697 (95% CI, 0.684–0.716) for qSOFA and 0.870 (95% CI, 0.861–0.881) for NEWS.
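The cohort sizes reported above can be cross-checked with a few lines of arithmetic. This is only a consistency check on the numbers in the abstract; the dictionary keys are labels chosen for this sketch.

```python
# ED visit counts as reported in the Results paragraph.
cohorts = {
    "minnesota_derivation": 1_271_865,
    "minnesota_internal_validation": 354_190,
    "bidmc_external": 448_804,
    "stanford_external": 118_385,
}

# The Minnesota cohort is the sum of its derivation and internal validation parts.
minnesota_total = (
    cohorts["minnesota_derivation"] + cohorts["minnesota_internal_validation"]
)
overall_total = sum(cohorts.values())

# Share of Minnesota visits used for model derivation.
derivation_share = cohorts["minnesota_derivation"] / minnesota_total

print(f"Minnesota total: {minnesota_total:,}")
print(f"Overall total:   {overall_total:,}")
print(f"Derivation share of Minnesota cohort: {derivation_share:.1%}")
```

The sums match the reported totals of 1,626,055 Minnesota visits and 2,193,244 visits overall, with roughly 78% of Minnesota visits used for derivation.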

Conclusions

The ESRP score, based on 10 easily obtainable triage variables, provided accurate, generalizable, and interpretable early sepsis identification across diverse ED populations. Its simplicity and strong performance suggest potential for integration into routine ED triage workflows to support timely sepsis care.
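To make concrete what a point-based triage score looks like in general, here is a minimal sketch of binned scoring. The abstract does not list the 10 ESRP variables, their cut-offs, or their point values, so everything in the table below (variable names, cut-offs, points) is an invented placeholder and must not be read as the ESRP score.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

# HYPOTHETICAL scoring table: variable -> (ascending cut-offs, points per bin).
# A value v falls in bin i, where i is the number of cut-offs that are <= v.
# These entries are placeholders for illustration, not the published ESRP score.
HYPOTHETICAL_POINTS: Dict[str, Tuple[List[float], List[int]]] = {
    "age_years": ([50, 70], [0, 2, 4]),
    "heart_rate_bpm": ([100, 120], [0, 1, 3]),
    "respiratory_rate_per_min": ([20, 24], [0, 1, 2]),
}


def points_for(value: float, cutoffs: List[float], points: List[int]) -> int:
    """Look up the points assigned to the bin that contains `value`."""
    return points[bisect_right(cutoffs, value)]


def total_score(patient: Dict[str, float]) -> int:
    """Sum the per-variable points for one triage record."""
    return sum(
        points_for(patient[name], cutoffs, points)
        for name, (cutoffs, points) in HYPOTHETICAL_POINTS.items()
    )


if __name__ == "__main__":
    # Invented triage record.
    example = {"age_years": 72, "heart_rate_bpm": 105, "respiratory_rate_per_min": 18}
    print(total_score(example))
```

Because each variable contributes an integer looked up from a small table, a score of this form can be tallied by hand at the bedside, which is the sense in which such scores are usually described as interpretable.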

Version published to 10.1101/2025.09.27.25336784 on medRxiv
Sep 29, 2025

Related articles

Rule-Based Electronic Sepsis Alerts Identify High-Risk Patients Despite Poor Diagnostic Accuracy: A Real-World Evaluation and Implications for Machine Learning

This article has 5 authors:
1. Eanna L Lowney
2. Steven G Hirth
3. Laura Fanning BPharm
4. Graeme J Duke
5. Owen Roodenburg
This article has no evaluations. Latest version: Jan 13, 2026

Development and External Validation of the SANTANDER Score: A Primary Care Clinical Decision Tool for Cardiovascular Risk Stratification in Colombia

This article has 5 authors:
1. Juan Sebastián Therán-León
2. Claudio Fernando García-Rojas
3. Harold Torres-Pinzón
4. Clara Inés Strauch-Díaz
5. Carlos Enrique Arenas-Moreno
This article has no evaluations. Latest version: Jan 29, 2026

Diagnostic Value of Complete Blood Count-Derived Inflammatory Indices for Predicting Adverse Outcomes in Geriatric Patients Presenting to the Emergency Department with Acute Infectious Diarrhea in the MIMIC-IV Database

This article has 3 authors:
1. Mete Ucdal
2. Karya Yurtsever
3. Evren Ekingen
This article has no evaluations. Latest version: Jan 30, 2026
