Post-matching analysis after coarsened exact matching: implications of coarsening for residual confounding and model dependence

Fei Wan

Abstract

Background: Coarsened Exact Matching (CEM) is a widely used design strategy aimed at reducing confounding in observational studies by matching treated and control units within strata of coarsened covariates. It is often promoted as a method that mimics a randomized block design, which has led many researchers to apply simple, unadjusted statistical methods (such as paired t-tests or McNemar’s test) originally developed for blocked randomized designs. However, CEM only ensures balance on the coarsened scale, and residual imbalances may remain on the original covariate scale, raising questions about the appropriateness of unadjusted analyses as the primary analytic approach.

Methods: We examine the implications of this coarsening process for post-matching analysis using literature review, conceptual arguments, and simulation studies. In particular, we evaluate how within-stratum heterogeneity in the original covariates affects residual confounding and the dependence of treatment effect estimates on outcome model specification.

Results: Our results show that matching on coarsened covariates can leave systematic differences between treated and control subjects within matched strata. These differences introduce residual confounding that does not disappear with increasing sample size. Simulation results further demonstrate a bias–variance trade-off induced by coarsening: fine coarsening may reduce residual confounding but can result in substantial data loss, whereas coarse binning preserves sample size at the cost of increased bias and greater reliance on outcome model specification.

Conclusions: CEM should be regarded primarily as a preprocessing tool for improving covariate overlap rather than as a stand-alone solution for confounding control. Valid causal inference following CEM generally requires regression adjustment using the original, uncoarsened covariates, and unadjusted analyses of matched data may yield biased treatment effect estimates.
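The residual-confounding argument in the abstract can be illustrated with a small simulation. The sketch below is not the article's simulation design. It is a minimal toy example under assumed conditions: a single uniform confounder `x`, a logistic treatment model, a linear outcome with true effect `tau`, and equal-width bins as the coarsening. The function names `simulate` and `cem_estimates` and all parameter values are hypothetical. It compares the unadjusted CEM-weighted difference in means with a weighted regression that also adjusts for the original, uncoarsened `x`.

```python
import numpy as np


def simulate(n=50000, tau=1.0, beta=2.0, seed=0):
    """Toy data (assumed design): x confounds treatment t and outcome y."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    # Treatment is more likely at larger x, even inside a coarse bin.
    p_treat = 1.0 / (1.0 + np.exp(-(4.0 * x - 2.0)))
    t = (rng.uniform(size=n) < p_treat).astype(float)
    y = tau * t + beta * x + rng.normal(0.0, 1.0, n)
    return x, t, y


def cem_estimates(x, t, y, n_bins):
    """Match exactly on equal-width bins of x, then estimate the effect two ways."""
    # Coarsen x (assumed to lie in [0, 1]) into n_bins strata.
    strata = np.minimum((x * n_bins).astype(int), n_bins - 1)
    n_t = np.bincount(strata, weights=t, minlength=n_bins)
    n_c = np.bincount(strata, weights=1.0 - t, minlength=n_bins)

    # Keep only strata that contain both treated and control units.
    keep_stratum = (n_t > 0) & (n_c > 0)
    matched = keep_stratum[strata]

    # ATT-style weights: treated get 1, controls get n_t / n_c in their stratum.
    ratio = np.where(keep_stratum, n_t / np.maximum(n_c, 1.0), 0.0)
    w = np.where(t == 1.0, 1.0, ratio[strata])
    xm, tm, ym, wm = x[matched], t[matched], y[matched], w[matched]

    # Unadjusted analysis: weighted difference in mean outcomes.
    is_treated = tm == 1.0
    unadjusted = (np.average(ym[is_treated], weights=wm[is_treated])
                  - np.average(ym[~is_treated], weights=wm[~is_treated]))

    # Adjusted analysis: weighted least squares of y on t and the original x.
    design = np.column_stack([np.ones_like(xm), tm, xm])
    sw = np.sqrt(wm)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], ym * sw, rcond=None)

    return {
        "unadjusted": float(unadjusted),
        "adjusted": float(coef[1]),
        "n_matched": int(matched.sum()),
    }


if __name__ == "__main__":
    x, t, y = simulate()
    for n_bins in (2, 5, 20):
        print(n_bins, cem_estimates(x, t, y, n_bins))
```

In this toy setup the outcome model is linear in `x` by construction, so the adjusted estimate is correctly specified. With a misspecified outcome model, coarser bins would leave more room for model dependence. With a single covariate and a large sample, almost no units are dropped, so the data-loss side of the trade-off described above does not appear here. It would require several covariates or finer bins relative to the sample size.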

Version published to 10.21203/rs.3.rs-9076964/v1 on Research Square
Apr 3, 2026

