Construction of Prediction Model for Severe Pneumonia in Children Based on Machine Learning

Abstract

Background: Pneumonia is the leading infectious cause of death in children under 5 years old worldwide. Children are vulnerable to delayed diagnosis and treatment of severe pneumonia (SP) because their respiratory and immune systems are underdeveloped and they have difficulty expressing their symptoms. SP can consequently cause a range of systemic complications, seriously threatening children's health and increasing the societal and economic burden. Current clinical tools for assessing pneumonia severity in children are limited in sensitivity, specificity, and inter-observer consistency. In addition, artificial intelligence research in pediatric pneumonia lags well behind that in adult pneumonia.

Objective: This study aimed to develop a machine learning-based prediction model for the early identification of severe pneumonia in children, in order to support timely intervention and clinical decision-making.

Methods: A retrospective analysis was conducted on 360 pneumonia cases admitted to the Affiliated Hospital of Yan'an University between August 2023 and August 2024. The cases were divided into severe (n=160) and mild (n=200) groups according to disease severity. Independent risk factors were identified through univariate and multivariate logistic regression analyses. Seven machine learning algorithms were used to construct prediction models: CatBoost, XGBoost, LightGBM, support vector machine (SVM), k-nearest neighbors (KNN), logistic regression (LR), and Gaussian naive Bayes (GNB). The dataset was randomly split into training (70%) and test (30%) sets for model development and evaluation. Model performance was assessed using accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUC). SHAP values were used to interpret and visualize the optimal model.

Results: Multivariate analysis identified the number of fever days, abdominal pain, elevated CRP, elevated plasma D-dimer, and plastic sputum thrombus formation as independent risk factors for severe pneumonia in children (all P<.05). Among the seven machine learning models assessed, the SVM performed best on the test set, achieving an AUC of .906, an accuracy of .843, and an F1 score of .817. SHAP analysis indicated that the number of fever days was the most influential feature for model predictions, followed by D-dimer and CRP levels.

Conclusion: The SVM model, built on fever days, abdominal pain, CRP, D-dimer, and plastic sputum thrombus, effectively predicted the risk of severe pneumonia in children. The SHAP framework also gives the model good interpretability, which facilitates the early identification of high-risk children. However, the model still requires validation on large-sample, multi-center external data.
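For readers less familiar with the evaluation metrics named in the Methods (accuracy, precision, recall, F1 score, and AUC), the following is a minimal sketch of how they are conventionally computed for a binary severe/mild classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration and are not study data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = severe, 0 = mild)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case (ties count as one half). Equivalent to the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, NOT from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

    print(classification_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason abstracts like this one report both kinds of metric.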
