A machine learning model for prediction of early-onset neonatal sepsis in low- and middle-income countries: Development and validation study

Deepika Kainth
Ayushi Gupta
Pradeep Singh
Satya Prakash
Anu Thukral
Ashok Deorari
Mudit Kapoor
Ramesh Agarwal
Tavpritesh Sethi
M Jeeva Sankar

Abstract

Objective

Early-onset sepsis (EOS), which occurs within the first 72 hours of life, can often be fatal for neonates. Machine learning (ML) models demonstrate promise for timely diagnosis. However, current ML models primarily rely on data from high-income countries, which reduces their applicability to low- and middle-income countries (LMICs) that have a higher burden and different disease profiles. We developed an ML model for the timely prediction of culture-proven EOS in LMICs.

Methods

We conducted a secondary analysis of the Delhi Neonatal Infection Study (2011-2014), carried out in three level-3 neonatal units in India. We extracted data for inborn neonates suspected of having EOS and excluded cases of culture-negative sepsis. Using a dynamic 80:20 (train:test) data split, we used two feature selection methods (Boruta and Lasso) across 64 variables and applied five machine learning techniques. We selected the optimal model based on performance metrics, targeting a sensitivity of 90%. The developed model was integrated into a web application and validated in an external cohort of neonates born between 2015 and 2021.
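The 90% sensitivity target can be illustrated with a small sketch. The abstract does not say how the operating point was chosen; this example assumes a cut-off was picked on the model's predicted probabilities. The function names and the toy scores are hypothetical and are not taken from the study.

```python
import math


def threshold_for_sensitivity(scores, labels, target=0.90):
    """Return the highest cut-off whose sensitivity is at least `target`.

    A neonate is flagged as high risk when score >= cut-off.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("need at least one culture-positive case")
    # Number of positives that must be flagged to reach the target
    # (small epsilon guards against floating-point round-up).
    needed = max(1, math.ceil(target * len(positives) - 1e-9))
    return positives[needed - 1]


def sensitivity_specificity(scores, labels, cutoff):
    """Compute sensitivity and specificity at a given cut-off."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 10 culture-positive and 10 culture-negative neonates.
pos_scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.65, 0.60, 0.55, 0.30, 0.10]
neg_scores = [0.05, 0.15, 0.20, 0.25, 0.35, 0.40, 0.45, 0.50, 0.75, 0.20]
scores = pos_scores + neg_scores
labels = [1] * len(pos_scores) + [0] * len(neg_scores)

cutoff = threshold_for_sensitivity(scores, labels, target=0.90)
sens, spec = sensitivity_specificity(scores, labels, cutoff)
print(f"cut-off={cutoff:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Fixing sensitivity first and accepting whatever specificity follows reflects the trade-off described here: missing a septic neonate is treated as costlier than treating a non-septic one.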

Results

Of 2,924 neonates, 548 (18.7%) had culture-proven sepsis. The mean gestation and birth weight were 35.3 (±3.8) weeks and 2,112 (±754) g, respectively. The combination of Boruta feature selection and a random forest classifier yielded the best model, which included 28 perinatal-neonatal variables. The sensitivity and specificity of the model were 90.3% and 40.6%, respectively. In external validation (n=147; 26 culture-proven sepsis cases), the model’s sensitivity, specificity, positive predictive value, and negative predictive value were 92.3%, 37.2%, 24.0%, and 95.7%, respectively. The sensitivity was 100% in asymptomatic neonates with only perinatal risk factors for EOS. Using the model could have reduced antibiotic usage from 74.8% to 55.7% (risk difference: -19.1%; 95% CI: -29.7 to -8.3).
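The external-validation figures can be cross-checked with simple arithmetic. The abstract reports only percentages, so the cell counts below (24 true positives, 2 false negatives, 45 true negatives, 76 false positives) are a reconstruction: they are the integer counts that reproduce the reported percentages for n=147 with 26 culture-proven cases, not numbers quoted from the paper.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 diagnostic accuracy measures, as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged among culture-positive
        "specificity": tn / (tn + fp),  # not flagged among culture-negative
        "ppv": tp / (tp + fp),          # culture-positive among flagged
        "npv": tn / (tn + fn),          # culture-negative among not flagged
    }


# Assumed (reconstructed) counts: 26 cases and 121 non-cases, 147 in total.
counts = {"tp": 24, "fn": 2, "tn": 45, "fp": 76}

metrics = diagnostic_metrics(**counts)
rounded = {name: round(100 * value, 1) for name, value in metrics.items()}

for name, pct in rounded.items():
    print(f"{name}: {pct}%")
```

Under these assumed counts, the high negative predictive value comes from only 2 of the 47 unflagged neonates being culture-positive, while the low positive predictive value reflects 76 false positives among 100 flagged neonates.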

Conclusions

The ML model demonstrated high sensitivity and acceptable specificity in predicting early-onset sepsis. This prediction model has the potential to assist in the timely and reliable identification of culture-positive sepsis and may serve as a bedside decision support tool in LMICs.

What is already known on this topic?

Machine learning models show good predictive performance for neonatal sepsis.
Existing models were developed using data from high-income countries, which raises concerns about their generalisability, and they have not been externally validated.

What does this study add?

We developed and externally validated a prediction algorithm using a large dataset, prospectively collected variables, and machine learning techniques to predict early-onset neonatal sepsis.
The model displayed 90.3% sensitivity and 40.6% specificity.

How this study might affect research, practice or policy

Our model, incorporated as a computer-based application, can be an excellent clinical aid for enhancing the clinician’s prediction of neonatal sepsis in low- and middle-income countries.

Version published to 10.1101/2025.09.20.25335989 on medRxiv
Sep 29, 2025
