Mapping Neighborhood-Level Drivers of Type 2 Diabetes: A Predictive-Causal Approach for Precision Public Health
Abstract
Type 2 diabetes has become an urban epidemic influenced by neighbourhood environments. However, conventional risk models that focus solely on individual factors fail to account for these community influences and often require detailed patient data that may not be available. To address this gap, we developed an integrated approach combining machine learning and causal inference to map type 2 diabetes risk at the community level. Using demographic, health, and socioeconomic data from 1,149 Census Tracts (CTs) in a large metropolitan region, we trained seven machine learning models to identify neighbourhoods with high diabetes prevalence. Although neighbourhood-level diabetes data were available for this study area, our model's high predictive accuracy on external validation data (area under the curve (AUC) = 0.95), particularly data from a distinct geographical region, demonstrates its potential utility for predicting diabetes risk in other regions, in Canada or elsewhere, where such data are unavailable. The top models achieved high recall (> 90%) and AUC up to 0.96 on test data, indicating accurate identification of high-risk neighbourhoods with few false positives. Survey-derived community health indicators, including obesity rate, physical inactivity, and median age, were strong predictors of diabetes prevalence. We then applied a Causal Forest approach to estimate the impact (Conditional Average Treatment Effect, τ) of modifiable factors. Higher work stress (τ = 0.312) and daily smoking (τ = 0.155) were moderately associated with increased risk, whereas better mental health (τ ≈ −1.1) was protective, highlighting mental health as a critical intervention priority, especially in neighbourhoods predicted to have high diabetes prevalence. These findings illustrate how community-level factors can guide targeted interventions and advance health equity, particularly for immigrant and visible-minority populations.
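As a reading aid for the τ values above, the conditional average treatment effect is commonly defined as follows. This is the standard textbook form, written here assuming a binary exposure with potential outcomes Y(1) and Y(0) and neighbourhood covariates X; the abstract does not state how each factor was coded, so the binary framing is an assumption made for illustration.

```latex
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
```

Under this reading, a positive τ corresponds to a higher expected outcome under exposure for neighbourhoods with covariates x, and a negative τ to a lower one.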
Our integrated machine-learning and causal framework lays the groundwork for precision public health, demonstrating how modifiable neighbourhood factors can indicate diabetes risk when patient-level data are scarce. Furthermore, our methodology is adaptable to other chronic diseases influenced by social and environmental determinants, potentially guiding targeted prevention efforts beyond type 2 diabetes.
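The recall and AUC figures reported in the abstract can be illustrated with a minimal sketch. The labels and scores below are hypothetical values invented for this example, not data from the study, and the 0.5 threshold is likewise an assumption. The sketch also computes the false-positive rate separately, since recall by itself only counts missed high-prevalence tracts.

```python
def recall(y_true, y_pred):
    """Share of truly high-prevalence tracts that were flagged: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def false_positive_rate(y_true, y_pred):
    """Share of low-prevalence tracts that were wrongly flagged: FP / (FP + TN)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn)


def auc(y_true, scores):
    """Probability that a random positive tract outscores a random negative one.

    Ties count as half a win (the pairwise, Mann-Whitney form of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = high diabetes prevalence) and model scores for 8 tracts.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.92, 0.81, 0.67, 0.40, 0.55, 0.30, 0.22, 0.10]

# Assumed decision threshold of 0.5 for turning scores into flags.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("recall:", recall(y_true, y_pred))
print("false-positive rate:", false_positive_rate(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

AUC is threshold-free and summarises ranking quality, whereas recall and the false-positive rate both depend on the chosen cut-off, so the three are usually reported together.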