Residual Permutation Tests for Feature Importance in Machine Learning
Abstract
Psychological research has traditionally relied on linear models to test scientific hypotheses. However, the emergence of machine learning (ML) algorithms has opened new opportunities for exploring variable relationships beyond linear constraints. To interpret the outcomes of these "black-box" algorithms, various tools for assessing feature importance have been developed. However, most of these tools are descriptive and do not facilitate statistical inference. To address this gap, our study introduces two versions of residual permutation tests (RPTs), designed to assess the significance of a target feature in predicting the label. The first variant, RPT on Y (RPT-Y), permutes the residuals of the label conditioned on features other than the target. The second variant, RPT on X (RPT-X), permutes the residuals of the target feature conditioned on the other features. Our simulation study demonstrates that RPT-X effectively maintains empirical Type I error rates within acceptable bounds and exhibits appreciable power in both regression and classification tasks. These findings suggest the utility of RPT-X for hypothesis testing in ML applications.
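The RPT-X procedure described above can be sketched in a few lines. The following is a hypothetical minimal implementation, not the authors' code: it assumes linear (OLS) residualization of the target feature on the remaining features and uses the absolute OLS coefficient of the target as the importance statistic; the paper's actual models and test statistics may differ.

```python
import numpy as np

def rpt_x(y, X, target, n_perm=1000, seed=0):
    """Residual permutation test on X (RPT-X), a minimal sketch.

    Regresses the target feature on the remaining features, permutes the
    resulting residuals, and rebuilds surrogate target features that are
    exchangeable under the null that the target adds nothing beyond the
    other features. Returns a permutation p-value.
    """
    rng = np.random.default_rng(seed)
    Z = np.delete(X, target, axis=1)
    Z1 = np.column_stack([np.ones(len(Z)), Z])  # add intercept
    x = X[:, target]

    # Residualize the target feature on the other features via OLS.
    beta_x, *_ = np.linalg.lstsq(Z1, x, rcond=None)
    x_hat = Z1 @ beta_x
    e_x = x - x_hat

    def stat(x_col):
        # Importance statistic: |OLS coefficient| of the (surrogate) target.
        D = np.column_stack([Z1, x_col])
        beta, *_ = np.linalg.lstsq(D, y, rcond=None)
        return abs(beta[-1])

    observed = stat(x)
    null = np.array(
        [stat(x_hat + rng.permutation(e_x)) for _ in range(n_perm)]
    )
    # Permutation p-value with the +1 correction to avoid zero p-values.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)
```

For example, on simulated data where the label depends on the target feature, `rpt_x` should return a small p-value, while a label generated only from the other features should yield a non-significant one.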