How many is enough? Sample Size in Staggered Difference-in-Differences Designs
Abstract
We use power simulations to examine when difference-in-differences techniques for staggered treatments provide informative estimates. Using data on GDP in US states, we show that effects larger than 2.5% are required before statistical tests achieve 80% power. Conditional on statistical significance, when interventions generate a 2.5% effect, difference-in-differences techniques overestimate the true effect by roughly 20%-50% on average. We use data on publicly traded firms to investigate how the necessary sample size differs by context: more than 500-1,000 firms are needed to study interventions generating a 10% increase in revenue. Finally, we examine how power can be improved through simple transformations that reduce the noise and autocorrelation of the data. Our paper shows that: i) surprisingly large sample sizes may be required to study meaningful interventions, ii) power simulations reveal which interventions can be studied with the data at hand, and iii) power may be better for alternative measures of similar concepts.
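The power simulations the abstract describes can be sketched in miniature. The code below is a hypothetical illustration, not the paper's actual procedure or data: it simulates a balanced panel with staggered adoption, a homogeneous treatment effect, and i.i.d. noise, then estimates a two-way fixed-effects DiD coefficient by OLS and records how often the effect is statistically significant. All parameter values (`n_units`, `noise_sd`, etc.) are made up for the example; real applications would calibrate them to the data at hand and cluster standard errors by unit.

```python
import numpy as np

def simulate_power(n_units=30, n_periods=10, effect=0.025,
                   noise_sd=0.02, n_sims=50, seed=0):
    """Monte Carlo power of a two-way fixed-effects DiD estimator
    under staggered adoption (illustrative DGP, not the paper's)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # Staggered adoption: half the units treated, random start periods.
        treated = rng.choice(n_units, size=n_units // 2, replace=False)
        starts = rng.integers(n_periods // 4, 3 * n_periods // 4,
                              size=n_units // 2)
        D = np.zeros((n_units, n_periods))
        for u, s in zip(treated, starts):
            D[u, s:] = 1.0
        # Outcome = unit FE + period FE + treatment effect + noise.
        alpha = rng.normal(0, 0.05, n_units)[:, None]
        gamma = rng.normal(0, 0.05, n_periods)[None, :]
        y = alpha + gamma + effect * D + rng.normal(0, noise_sd,
                                                    (n_units, n_periods))
        # TWFE regression: treatment dummy plus unit and period dummies.
        Y, d = y.ravel(), D.ravel()
        unit_ix = np.repeat(np.arange(n_units), n_periods)
        time_ix = np.tile(np.arange(n_periods), n_units)
        X = np.column_stack([
            d,
            (unit_ix[:, None] == np.arange(1, n_units)).astype(float),
            (time_ix[:, None] == np.arange(1, n_periods)).astype(float),
            np.ones_like(d),
        ])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ beta
        sigma2 = resid @ resid / (len(Y) - X.shape[1])
        # Homoskedastic SE for the treatment coefficient (a real study
        # would cluster by unit instead).
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[0, 0])
        rejections += abs(beta[0] / se) > 1.96
    return rejections / n_sims

if __name__ == "__main__":
    # Power rises with the assumed effect size; near `effect=0` the
    # rejection rate should hover around the 5% nominal size.
    print(simulate_power(effect=0.0))
    print(simulate_power(effect=0.05))
```

Varying `effect`, `n_units`, or `noise_sd` in such a loop is how one maps out the minimum detectable effect for a given dataset, which is the exercise the paper performs on state GDP and firm revenue data.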