Causal Pitfalls of Feature Attributions in Financial Machine Learning Models
Abstract
The increasing deployment of complex machine learning models in high-stakes financial applications, such as asset pricing, credit risk assessment, and fraud detection, has heightened the demand for model explainability. Feature attribution methods are commonly employed to provide insight into model decisions by assigning importance scores to input features. However, these attributions are often misconstrued as indicators of true causal influence, a leap that can be particularly perilous in finance, where spurious correlations and confounding variables are prevalent. This paper investigates the causal faithfulness of feature attribution methods using finance-motivated synthetic data-generating processes with known causal ground truth for asset pricing, credit risk, and fraud detection. We train Multilayer Perceptrons, LSTMs, and XGBoost models and evaluate the ability of attribution methods (including SHAP, Integrated Gradients, and XGBoost's built-in importance) to identify the truly causal features. Our experiments show that XGBoost's built-in feature importance provides the most causally faithful explanations overall, and that XGBoost models generally yield more causally accurate attributions across scenarios. Among the three scenarios, asset pricing produced the highest overall faithfulness scores for causal feature identification with the top-performing methods. Fraud detection, by contrast, presents unique challenges: indirect indicators (consequences of fraud) often receive substantial attribution weight despite not being causal drivers. The considerable variability in performance across methods suggests that practitioners should combine multiple attribution techniques to obtain a robust understanding. These findings underscore the need for caution when interpreting feature attributions as causal explanations and emphasize the necessity of domain expertise and proper validation.
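To make the evaluation idea concrete, the following is a minimal sketch, not the paper's actual experimental code, of how a synthetic data-generating process with known causal ground truth can be used to test attribution faithfulness: a causal feature, a confounded non-causal correlate, and pure noise are generated, an XGBoost model is fit, and the attribution mass assigned to the true cause is measured. It assumes numpy, xgboost, and shap are installed; the feature names, DGP coefficients, and "causal share" score are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch only: a toy DGP with known causal structure, used to check
# whether attributions concentrate on the causal feature. Assumptions: the DGP,
# feature names, and faithfulness score below are hypothetical, not the paper's.
import numpy as np
import xgboost as xgb
import shap

rng = np.random.default_rng(0)
n = 5000

# Known ground truth: `value` causes returns; `sentiment` is a spurious
# correlate (confounded via `value`); `noise` is irrelevant.
value = rng.normal(size=n)
sentiment = 0.9 * value + rng.normal(scale=0.5, size=n)
noise = rng.normal(size=n)
returns = 2.0 * value + rng.normal(scale=0.3, size=n)

X = np.column_stack([value, sentiment, noise])
feature_names = ["value", "sentiment", "noise"]

model = xgb.XGBRegressor(n_estimators=200, max_depth=3).fit(X, returns)

# XGBoost's built-in gain-based importance (features appear as f0, f1, f2
# because the model was trained on a plain numpy array).
print("built-in gain:", model.get_booster().get_score(importance_type="gain"))

# Mean |SHAP| per feature as an attribution score.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
mean_abs = np.abs(shap_values).mean(axis=0)
for name, score in zip(feature_names, mean_abs):
    print(f"mean |SHAP| {name}: {score:.3f}")

# One simple faithfulness proxy: the fraction of total attribution mass
# that lands on the truly causal feature.
print("causal share:", mean_abs[0] / mean_abs.sum())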