Unintended Pregnancy and Preterm Birth in the United States: Causal Inference and Risk Prediction Using National Survey of Family Growth Data

Abstract

Background

Unintended pregnancy remains common in high-income countries and has been linked to poorer maternal and neonatal outcomes. Whether pregnancy intention has an independent causal effect on preterm birth, beyond social and clinical risk factors, is uncertain.

Methods

We conducted a cross-sectional analysis of a nationally representative sample of singleton live births from a US reproductive health survey. Pregnancy intention (intended vs unintended) was based on the intention reported for the time of conception. Preterm birth was defined as delivery before 37 completed weeks. We used survey-weighted logistic regression and a suite of causal estimators, including inverse probability weighted marginal structural models, augmented inverse probability weighting, targeted maximum likelihood estimation with Super Learner, Bayesian g-computation, and causal forests. Models adjusted for maternal age, race and ethnicity, parity, marital or cohabiting status, education, poverty ratio, insurance, and body mass index. We also trained Super Learner prediction models with 10-fold cross-validation and evaluated discrimination, calibration, high-risk stratification, and net clinical benefit.
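As an illustration of one estimator named above, the following is a minimal sketch of augmented inverse probability weighting (AIPW) for a risk difference. It is not the authors' implementation: it assumes a single discrete confounder stratum x, uses saturated (stratum-mean) models for both the propensity score and the outcome, and ignores survey weights. The function name and the toy input format are hypothetical.

```python
from collections import defaultdict


def aipw_risk_difference(rows):
    """AIPW estimate of E[Y(1)] - E[Y(0)].

    rows: iterable of (a, y, x) tuples with binary exposure a,
    binary outcome y, and a discrete covariate stratum x.
    Assumption: nuisance models are saturated stratum means,
    and no survey weights are applied.
    """
    rows = list(rows)
    n_x = defaultdict(int)       # observations per stratum
    n_ax = defaultdict(int)      # observations per (exposure, stratum)
    sum_y = defaultdict(float)   # outcome totals per (exposure, stratum)
    for a, y, x in rows:
        n_x[x] += 1
        n_ax[(a, x)] += 1
        sum_y[(a, x)] += y

    # Positivity: every stratum needs exposed and unexposed observations.
    for x in n_x:
        if n_ax[(1, x)] == 0 or n_ax[(0, x)] == 0:
            raise ValueError(f"positivity violated in stratum {x!r}")

    total = 0.0
    for a, y, x in rows:
        e = n_ax[(1, x)] / n_x[x]            # propensity score P(A=1 | x)
        m1 = sum_y[(1, x)] / n_ax[(1, x)]    # outcome model E[Y | A=1, x]
        m0 = sum_y[(0, x)] / n_ax[(0, x)]    # outcome model E[Y | A=0, x]
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0) + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)
    return total / len(rows)
```

With saturated nuisance models the weighted residual terms cancel within each stratum, so this sketch reduces to the standardized risk difference. The doubly robust property matters when the propensity and outcome models are fitted parametrically or by machine learning, as in the analysis described above.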

Findings

In the weighted population, 39.1% of pregnancies were unintended. Preterm birth occurred in 12.6% of unintended vs 9.0% of intended pregnancies. In survey-weighted logistic models, unintended pregnancy was associated with higher odds of preterm birth (adjusted odds ratio 1.43, 95% CI 1.06 to 1.94). Across advanced causal estimators, the risk difference for unintended vs intended pregnancy was small but consistent, around 3 excess preterm births per 100 live births, with limited positivity and modest E-values suggesting that unmeasured confounding could attenuate or explain part of the association. A Super Learner ensemble achieved excellent discrimination (area under the curve about 0.98 vs 0.56 for baseline logistic regression), good calibration, and identified a top 10% risk stratum with markedly higher observed preterm birth risk than the lower 90%.
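For readers unfamiliar with E-values, the sketch below applies the standard formula E = RR + sqrt(RR x (RR - 1)) to the reported adjusted odds ratio and its lower confidence limit, and computes the crude risk difference from the reported percentages. Treating the odds ratio as an approximation to the risk ratio is an assumption made here because the outcome is fairly uncommon (about 10%). The resulting numbers are illustrative and are not the E-values reported by the authors.

```python
import math


def e_value(rr):
    """E-value for a risk ratio (VanderWeele-Ding formula).

    Ratios below 1 are inverted first so the formula applies symmetrically.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


# Figures taken from the Findings paragraph.
risk_unintended = 0.126
risk_intended = 0.090
crude_risk_difference = risk_unintended - risk_intended

# Assumption: the adjusted odds ratio is used as a stand-in for a risk ratio.
e_point = e_value(1.43)        # point estimate
e_lower_limit = e_value(1.06)  # confidence limit closest to the null

print(f"crude risk difference: {crude_risk_difference:.3f}")
print(f"E-value (point estimate): {e_point:.2f}")
print(f"E-value (lower limit): {e_lower_limit:.2f}")
```

An E-value close to 1 for the confidence limit would mean that a fairly weak unmeasured confounder could move the interval to include the null, which is the sense in which the E-values are described as modest.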

Interpretation

In this national sample, unintended pregnancy functioned primarily as a marker of concentrated social and clinical vulnerability rather than a large, isolated causal driver of preterm birth. Nonetheless, pregnancy intention materially improved risk stratification when combined with standard covariates. Joint use of causal inference and machine learning provides a defensible framework to target intensified antenatal support to women at highest risk while avoiding overinterpretation of intention as a deterministic cause.
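The trade-off involved in targeting support to a high-risk group is what net clinical benefit quantifies. The following is a minimal sketch of the usual decision-curve definition, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the risk threshold above which a woman would be flagged. The function name and inputs are hypothetical, and the sketch is unweighted, so it does not reproduce the survey-weighted evaluation in the paper.

```python
def net_benefit(y_true, predicted_risk, threshold):
    """Decision-curve net benefit at a given risk threshold.

    y_true: sequence of 0/1 outcomes (1 = preterm birth).
    predicted_risk: sequence of predicted probabilities, same length.
    threshold: risk cut-off in (0, 1) at or above which someone is flagged.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(y_true) != len(predicted_risk) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    pairs = list(zip(y_true, predicted_risk))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are penalized by the odds of the threshold.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)
```

Comparing this quantity across thresholds against the "flag everyone" and "flag no one" strategies gives a decision curve. A model adds value at a threshold only if its net benefit exceeds both alternatives there.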

Funding

No external funding.
