Construction of Prediction Model for Severe Pneumonia in Children Based on Machine Learning

Shuai Yu
Zhengfeng Xue
Qing Liu
Yaya Ren
Xue Zhou
Yuanxia Li

Abstract

Background: Pneumonia is the primary infectious disease and the leading cause of death among children under 5 years old worldwide. Children with severe pneumonia (SP) are prone to delayed diagnosis and treatment because their respiratory and immune systems are underdeveloped and they have difficulty expressing symptoms. As a result, SP can cause a range of systemic complications, seriously threatening children's health and increasing the social and economic burden. Current clinical tools for assessing pneumonia severity in children are limited in sensitivity, specificity, and inter-observer consistency. In addition, artificial intelligence research in pediatric pneumonia lags well behind that in adult pneumonia.

Objective: This study aimed to develop a machine learning-based prediction model for the early identification of severe pneumonia in children, to support early intervention and clinical decision-making.

Methods: A retrospective analysis was conducted on 360 pneumonia cases admitted to the Affiliated Hospital of Yan'an University between August 2023 and August 2024. The cases were divided into severe (n=160) and mild (n=200) groups according to disease severity. Independent risk factors were identified through univariate and multivariate logistic regression analyses. Seven machine learning algorithms were used to construct the prediction model: CatBoost, XGBoost, LightGBM, support vector machine (SVM), k-nearest neighbors (KNN), logistic regression (LR), and Gaussian naive Bayes (GNB). The dataset was randomly split into a training set (70%) and a test set (30%) for model development and evaluation. Model performance was evaluated with metrics including accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUC). SHapley Additive exPlanations (SHAP) values were used to interpret and visualize the optimal model.

Results: Multivariate analysis identified fever days, abdominal pain, elevated CRP, elevated plasma D-dimer, and plastic sputum thrombus formation as independent risk factors for severe pneumonia in children (all P < 0.05). Of the seven machine learning models assessed, the support vector machine (SVM) performed best on the test set, with an AUC of 0.906, an accuracy of 0.843, and an F1 score of 0.817. SHAP analysis indicated that the number of days with fever was the most influential feature for model predictions, followed by D-dimer and CRP levels.

Conclusion: The SVM model, using fever days, abdominal pain, CRP, D-dimer, and plastic sputum thrombus, effectively predicts the risk of severe pneumonia in children. The SHAP framework also makes the model interpretable, which helps identify high-risk children early. However, the model still requires validation on large-sample, multi-center external data.
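
The abstract names the evaluation procedure (a random 70/30 split, then accuracy, precision, recall, F1 score, and AUC) but does not show how these quantities are computed. The following is a minimal, standard-library sketch of that procedure, not the authors' code. The function names `split_indices`, `classification_metrics`, and `roc_auc`, the 0.5 decision threshold, the random seed, and the toy labels and scores are all assumptions made for illustration; none of the numbers come from the study.

```python
import random


def split_indices(n_samples, test_fraction=0.3, seed=42):
    """Shuffle indices 0..n_samples-1 and return (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = severe)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Split sizes for a cohort of 360 cases (the cohort size reported above).
train_idx, test_idx = split_indices(360)
print(len(train_idx), len(test_idx))

# Toy test-set labels and model scores (invented, not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Unlike the other four metrics, AUC is computed from the raw scores rather than the thresholded predictions, so it does not depend on the choice of decision threshold.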

Version published to 10.21203/rs.3.rs-7575261/v1 on Research Square
Apr 7, 2026

Related articles

Construction and Validation of an Interpretable Machine Learning Model with SHAP for Identifying Infectious Diseases in Fever of Unknown Origin

This article has 5 authors:
1. Fei Li
2. Xu Zhang
3. Juan Zhang
4. Yang Yu
5. Jie Yang
This article has no evaluations. Latest version: Apr 9, 2026
Dynamic Landmark-Based Prediction of Sepsis Using Interpretable and Balanced Machine Learning Models in Respiratory-Supported Critically ill Patients

This article has 7 authors:
1. Ayao Sangenis Assogba
2. Jennifer H. Gladius
3. Komi Selassi Gayi
4. Samadou Tchakondo
5. Yendouname Kandjoni
6. Richard Sagacity Tugbeh
7. Rachana Das
This article has no evaluations. Latest version: Mar 25, 2026
Comparative Analysis of Deep Learning and Machine Learning Models for Early Prediction of Chronic Kidney Disease

This article has 3 authors:
1. Debabrata Maity
2. Subahsish Banerjee
3. Arnab Bandyopadhyay
This article has no evaluations. Latest version: Apr 15, 2026
