To Interact or not to Interact: Pros and Cons of Including Interactions in Linear Regression Models

Abstract

Interaction effects are very common in the psychological literature. However, interaction effects are typically very small and often fail to replicate. In this study, we conducted a simulation comparing the generalizability and estimability of two linear regression models: one correctly specified to account for interaction effects and one misspecified including simple effects only. We manipulated noise levels, predictor variable correlations, and different sets of regression weights, resulting in 9216 different conditions. From each dataset we drew 1000 samples of N = 25, 50, 100, 250, 500, and 1000, resulting in a total of 55,296,000 analyses for each model. Our results show that misspecification can drastically bias regression estimates, sometimes leading to zero or reversed simple effects. Furthermore, we found that when models are generalized to the entire population, the difference between the explained variance in the sample and in the population is often smaller for the misspecified model than for the correctly specified model. However, the comparison between models shows that the correctly specified model explains the data at population level better overall. These results emphasize the importance of theory in modeling choices and show that it is important to provide a rationale for why interactions are included or excluded in an analysis.
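To illustrate the kind of comparison described in the abstract, here is a minimal sketch in Python. It is not the authors' simulation code. The regression weights, predictor means, correlation, noise level, and the helper names `simulate` and `fit_ols` are all assumptions chosen for illustration. It generates one large dataset from a model with an interaction, then fits both a model that includes the interaction term and one that omits it.

```python
import numpy as np

def simulate(n, b=(0.5, 0.3, -0.5), rho=0.3, mean=(1.0, 1.0), noise_sd=1.0, seed=0):
    """Draw data from y = b1*x1 + b2*x2 + b3*x1*x2 + noise.

    All defaults are illustrative assumptions, not values from the article.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # unit variances, correlation rho
    X = rng.multivariate_normal(mean, cov, size=n)
    x1, x2 = X[:, 0], X[:, 1]
    y = b[0] * x1 + b[1] * x2 + b[2] * x1 * x2 + rng.normal(0.0, noise_sd, size=n)
    return x1, x2, y

def fit_ols(columns, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

x1, x2, y = simulate(n=100_000)

# Correctly specified model: simple effects plus the interaction term.
full = fit_ols([x1, x2, x1 * x2], y)
# Misspecified model: simple effects only.
simple = fit_ols([x1, x2], y)

print("with interaction:   ", np.round(full, 3))
print("without interaction:", np.round(simple, 3))
```

Under these assumed settings, the model with the interaction term should recover slopes close to the generating weights. The model without it should estimate a slope for `x1` near zero and a negative slope for `x2`, because with normally distributed predictors the omitted product term is absorbed into the remaining slopes in proportion to the other predictor's mean. This mirrors the zero or reversed simple effects mentioned in the abstract, although the conditions in the actual study may differ.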
