A Causal Data Science Framework for Educational Displacement Under Extreme Resource Scarcity: Simulation-Based Evidence from Gaza (2023–2026)


Abstract

Educational disruption in conflict-affected regions is often quantified through descriptive statistics, yet rarely analysed through causal lenses that account for the sequential nature of household decisions under survival constraints. This study introduces a causal data science framework that combines causal inference with machine learning to estimate the causal effect of resource-based interventions on school attendance in closed-system scarcity environments. Using secondary data from United Nations agencies, the World Bank, and peer-reviewed literature (2023–2026), we construct a synthetic population that replicates the demographic, nutritional, and water-access conditions of the Gaza Strip. The framework estimates heterogeneous treatment effects through a two-stage procedure: first, inverse probability weighting adjusts for observed confounders; second, double machine learning with gradient boosting and causal forests captures non-linear interactions and effect heterogeneity. Policy implications are derived from optimal policy trees that partition households into subgroups with distinct intervention recommendations. Results indicate that decentralised water access increases attendance by an average of 32.1 percentage points, with gains reaching 38–45 percentage points among households initially spending more than five hours on daily survival labour. Nutritional supplementation alone yields a smaller but significant average gain of 11.3 percentage points, primarily through cognitive recovery. Critically, the two interventions are complementary: a formal interaction analysis reveals a synergistic effect of 12.4 percentage points (p < 0.001), such that combined water–nutrition packages generate substantially larger gains than either intervention alone. Policy trees recommend water interventions for high-labour households and combined water–nutrition packages for those with elevated physiological penalty scores.
All causal estimates pass refutation tests (random common cause, placebo treatment, data subset), supporting their robustness. Because it relies exclusively on secondary data and simulation, the framework requires no primary data collection or direct human subject involvement, thereby avoiding the logistical and institutional review complexities of fieldwork in active conflict zones. The methodology is readily transferable to other humanitarian settings where secondary data are available.
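
To make the first stage of the procedure concrete, the following is a minimal sketch of inverse probability weighting on a toy synthetic population. It is not the authors' implementation. The variable names (`high_labour`, `water`, `attend`), the single binary confounder, and every numeric parameter are illustrative assumptions chosen for this example. None of them are taken from the study's data or results.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical confounder: 1 if a household spends more than five hours
# per day on survival labour (assumed 50% prevalence, for illustration only).
high_labour = rng.binomial(1, 0.5, n)

# Treatment assignment depends on the confounder, so a naive comparison is biased.
p_treat = np.where(high_labour == 1, 0.7, 0.3)
water = rng.binomial(1, p_treat)

# Illustrative outcome model: lower baseline attendance and a larger
# treatment effect for high-labour households (assumed values).
base = np.where(high_labour == 1, 0.20, 0.50)
effect = np.where(high_labour == 1, 0.30, 0.10)
attend = rng.binomial(1, base + effect * water)

# True average treatment effect implied by the assumed parameters.
true_ate = 0.5 * 0.30 + 0.5 * 0.10

# Naive difference in means, ignoring confounding.
naive = attend[water == 1].mean() - attend[water == 0].mean()

# Estimate the propensity score within each confounder stratum.
e_hat = np.empty(n)
for g in (0, 1):
    mask = high_labour == g
    e_hat[mask] = water[mask].mean()

# Normalised (Hajek) inverse probability weighted estimate of the ATE.
w1 = water / e_hat
w0 = (1 - water) / (1 - e_hat)
ipw = (w1 * attend).sum() / w1.sum() - (w0 * attend).sum() / w0.sum()

print(f"true ATE: {true_ate:.3f}  naive: {naive:.3f}  IPW: {ipw:.3f}")
```

In this toy setup the naive contrast understates the effect because treated households are disproportionately those with low baseline attendance, while the weighted estimate recovers a value close to the assumed average effect. A realistic application would use a fitted propensity model over many covariates rather than a single stratum frequency.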
