Analysis of Premature Mortality from Cardio-Cerebrovascular Diseases in Bogotá (2010–2022): An Analytical and Classification-Based Machine Learning Approach
Abstract
Premature mortality from cardio-cerebrovascular diseases represents a growing burden on health systems, particularly in urban contexts across Latin America. This study analyzes mortality records from Bogotá for 2010–2022 using descriptive analysis, time series analysis, and machine learning models. It includes deaths among individuals aged over 30, classified as premature or nonpremature based on a 75-year age threshold. Supervised models were trained on sociodemographic, insurance-related, and underlying cause-of-death variables, and their performance was evaluated with standard metrics. The random forest model showed the best overall performance, with educational level, insurance scheme, and place of death emerging as the main predictors. Additionally, separate models were developed for four diagnostic groups (ischemic, cerebrovascular, hypertensive, and heart failure), and these revealed differences in classification patterns across groups. Trend analysis revealed a sustained increase in premature mortality, which rose further during the pandemic period. These findings underscore the role of social determinants in premature cardiovascular deaths and highlight the potential of machine learning as a decision-support tool for public health.
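
To make the inclusion and labeling rule concrete, the following is a minimal Python sketch, not the authors' code. It assumes that "aged over 30" means an age strictly greater than 30 and that a death strictly below age 75 counts as premature; the abstract does not say how the boundary ages are handled. The function names and the sample records are hypothetical and purely illustrative.

```python
from collections import defaultdict

PREMATURE_AGE_THRESHOLD = 75  # threshold stated in the abstract
MIN_AGE = 30                  # abstract: individuals "aged over 30"


def label_death(age):
    """Return 'premature', 'nonpremature', or None if the record is excluded."""
    # Assumption: "aged over 30" excludes age 30 itself.
    if age <= MIN_AGE:
        return None
    # Assumption: deaths strictly below the threshold are premature.
    return "premature" if age < PREMATURE_AGE_THRESHOLD else "nonpremature"


def premature_share_by_year(records):
    """Map each year to the share of included deaths labeled premature.

    `records` is an iterable of (year, age_at_death) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [premature, total included]
    for year, age in records:
        label = label_death(age)
        if label is None:
            continue  # outside the study population
        counts[year][1] += 1
        if label == "premature":
            counts[year][0] += 1
    return {year: p / t for year, (p, t) in sorted(counts.items())}


# Made-up records for illustration only; not data from the study.
records = [
    (2010, 62), (2010, 81), (2010, 25),
    (2011, 70), (2011, 74), (2011, 90), (2011, 75),
]
shares = premature_share_by_year(records)
print(shares)
```

A yearly share like this is one simple input for a trend analysis. The resulting binary label is also the kind of target a supervised classifier such as a random forest would be trained to predict from the sociodemographic, insurance, and cause-of-death variables.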