Predicting Chronic Kidney Disease Risk Factors Using Machine Learning: Using the 2021 Korea National Health and Nutrition Examination Survey


Abstract

Background: The prevalence of chronic kidney disease (CKD) in Korea increases annually. With the rapidly aging population in Korea, the number of patients with CKD is expected to increase further. CKD imposes a significant burden on both individuals and the country. However, owing to a lack of awareness of CKD, most patients are diagnosed at the end stage of the disease. Therefore, this study aims to develop a machine learning model for CKD to identify at-risk patients, slow disease progression, and prevent complications.
Methods: Based on the Rainbow model, 61 variables were considered as explanatory variables. Among adults, 197 (5.1%) of 3,868 participants were classified as having CKD; among the elderly, 135 (11.1%) of 1,216 participants were. Six machine learning methods were used to explore risk factors for CKD and to identify the model with the best predictive performance. Logistic regression analysis was used to confirm the importance of the key variables in each selected machine learning model.
Results: In adults, the boosting method demonstrated the highest predictive power for CKD (accuracy, 0.974; precision, 0.975; recall, 0.974; F1 score, 0.968; AUC, 0.886). In the elderly population, the Naïve Bayes model was the most effective for predicting CKD (accuracy, 0.905; precision, 0.922; recall, 0.905; F1 score, 0.912; AUC, 0.744). Logistic regression analysis reaffirmed the risk factors identified. In adults, these included urine protein, age, private insurance, hypertension, diabetes, residential area, and anemia. In the elderly, they included urine protein, anemia, age, diabetes, private insurance, moderate to high physical activity levels, household composition, sex, number of elderly leisure welfare facilities per 1,000 people, and subjective health perception.
Conclusion: The machine learning risk model for CKD developed in this study serves as a foundation for formulating nursing plans, establishing early warning systems through prediction, and developing nursing guidelines in nursing practice.
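
The reported metrics are easier to read against the class balance given in the Methods. The following is a minimal sketch, not the authors' code: it uses only the adult class counts from the abstract (197 CKD cases among 3,868 participants) and a hypothetical baseline that never predicts CKD. It also assumes support-weighted averaging of precision, recall, and F1. The abstract does not state the averaging scheme; the identical accuracy and recall values it reports are consistent with weighted averaging, under which recall always equals accuracy, but that remains an assumption. The function name `weighted_metrics` is illustrative.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec = tp / predicted if predicted else 0.0  # 0 when class is never predicted
        rec = tp / support[label]
        f = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        weight = support[label] / n  # weight each class by its share of samples
        precision += weight * prec
        recall += weight * rec
        f1 += weight * f

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Adult class counts taken from the abstract.
ADULT_TOTAL, ADULT_CKD = 3868, 197
y_true = [1] * ADULT_CKD + [0] * (ADULT_TOTAL - ADULT_CKD)

# Hypothetical baseline (an assumption for illustration): never predict CKD.
y_pred = [0] * ADULT_TOTAL

baseline = weighted_metrics(y_true, y_pred)
for name, value in baseline.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions, the baseline shows how much of a headline accuracy figure can come from class imbalance alone, which is why a threshold-independent measure such as AUC and the per-class recall for CKD cases are useful companions to accuracy.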
