Estimating causal effects of rare treatments on binary outcomes: Addressing sample size requirements and bias correction
Abstract
Background: Estimating the causal effect of an exposure (treatment) in observational epidemiological studies has received considerable attention over the last decade. Among the methods in the causal inference literature, inverse probability weighting (IPW) is widely used to estimate the causal effect of an exposure on an outcome of interest. IPW estimators, built from estimated propensity scores (PS) for a binary exposure, are unbiased and consistent under certain identifiability conditions. However, their properties have not been studied extensively for rare binary exposures, specifically in settings with low events per variable (EPV) in the PS model, where the EPV is often used to quantify the rareness of a binary exposure.
Methods: Because the EPV depends on the sample size and the exposure prevalence, this study uses simulations to investigate the performance of the IPW estimator under varying EPV conditions, aiming to (i) determine the minimum EPV required to obtain a consistent IPW estimator of the average causal effect (IPW-ATE) and (ii) evaluate bootstrap-based bias correction for low-EPV cases. Two separate simulation series, one for each aim, were conducted under scenarios created by varying the outcome prevalence, sample size, number of baseline covariates, and degree of PS model misspecification.
Results: The simulations show that the IPW-ATE estimator is biased when EPV $< 8$, with greater bias under misspecified PS models. The bias decreases systematically as the EPV increases and becomes negligible at EPV $\ge 8$. For low EPV values, bootstrap bias correction effectively reduces the bias, yielding nearly unbiased estimates. The method is applied to estimate the causal effect of low birth weight on malnutrition in Bangladeshi children under five, adjusting for the treatment selection bias induced by several baseline covariates. The empirical findings were consistent with the simulation findings.
Conclusion: Based on these findings, practical recommendations are offered to help researchers ensure reliable causal inference in studies with rare treatments, where conventional IPW methods may fail.