Cervical cancer screening uptake and its associated factors in Sub-Saharan Africa: a machine learning approach

Abstract

Introduction: Cervical cancer, which includes squamous cell carcinoma and adenocarcinoma, is a leading cause of cancer-related deaths globally, particularly in low- and middle-income countries (LMICs). It is preventable through early screening, yet incidence and mortality are significantly higher in LMICs, where 94% of deaths occur. Poor implementation of screening programs, together with multiple health system barriers, drives the high burden of cervical cancer in these countries. Projections show cases and deaths rising by 2030. Addressing this will require strong policy leadership, comprehensive community-based advocacy, and improved health systems that support screening and women's health.

Method: Secondary data from the Demographic and Health Survey (DHS) for ten Sub-Saharan African countries were used to evaluate cervical cancer screening among women aged 25–49 years. Preprocessing comprised data cleaning and class balancing, followed by removal of missing values and outliers. The data were then split into training and validation sets of 80% and 20%, respectively. Seven machine learning classification algorithms were used to predict cervical cancer screening outcomes: Logistic Regression, Decision Tree Classifier, Random Forest, K-Nearest Neighbor, Gradient Boosting, AdaBoost, and Extra Trees. Model performance was evaluated using accuracy, precision, recall, and F1 score.

Result: Cervical cancer screening behavior was predicted for a weighted sample of 75,360 women aged 25–49 from the ten countries. The Extra Trees Classifier achieved an accuracy of 94.13%, a precision of 95.76%, a recall of 94.12%, and an F1-score of 93.80%, followed by Random Forest with an accuracy of 93.87% and a precision of 99.18%. Health visits, proximity to health care, contraceptive use, urban residence, and media exposure were the most important predictors. Class balancing by oversampling and hyperparameter tuning via grid search strengthened the models. The ensemble methods, Extra Trees and Random Forest, showed the best generalization, indicating that they work well on complex datasets and can help devise targeted intervention strategies.

Conclusion: This study demonstrates that ensemble machine learning models, such as the Extra Trees Classifier and Random Forest, are promising for predicting cervical cancer screening behavior among African women, with accuracies of 94.13% and 93.87%, respectively. Key predictors include healthcare access, sociocultural factors, media exposure, urban residence, and contraceptive use. The findings emphasize the need to reduce barriers to care and to use family planning visits and mass media to promote screening. These results need validation in different populations before clinical integration through decision support systems.
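
The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes a binary outcome (1 = screened, 0 = not screened) and computes each metric for the positive class from confusion-matrix counts. The abstract does not state how the study averaged its metrics, so this is not necessarily the authors' exact computation. The function name `classification_metrics` and the label lists are hypothetical and made up for illustration; they are not study data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for ten women (1 = screened, 0 = not screened); not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Computed this way, F1 is the harmonic mean of precision and recall for the positive class, so it always lies between the two. When metrics are instead averaged across classes (for example, weighted by class size), the reported F1 can fall outside that range.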
