Epistemic Frontiers: Distinguishing Causality, Information, and Predictability in Pattern Recognition

Pablo Neirz
Héctor Allende
Carolina Saavedra

Abstract

High predictive accuracy is frequently misinterpreted as evidence of causal understanding or population-level signal. Models can exploit spurious correlations, confounding, or protocol-induced artefacts, while post-hoc explanations may faithfully describe model behaviour yet remain misleading about the underlying phenomenon. We propose a framework that separates three layers of evidence: (i) causal relations in the phenomenon, (ii) population-level statistical dependence, and (iii) finite-sample, protocol-dependent predictive effects. This separation clarifies why predictive success and feature attributions do not license mechanistic interpretations without additional assumptions. Under log-loss and Bayes-risk-consistent protocols, the population predictive value of adding a feature equals the conditional mutual information, providing a principled reference for "true signal." Using controlled simulations, we illustrate that bootstrap resampling can create false positives by amplifying chance correlations, and that SHAP can assign high importance to confounded variables while remaining faithful to the fitted model. These results suggest that "feature importance" is best treated as protocol-bounded evidence, and that interpretation benefits from reporting the protocol, robustness checks, and the intended inferential scope.
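
The log-loss statement in the abstract can be checked numerically on a toy example. The sketch below is an illustration only and is not taken from the article: the discrete joint distribution `p[x1, x2, y]`, the function names, and the random seed are all hypothetical. It assumes a strictly positive probability mass function and a Bayes-optimal predictor, in which case the expected log-loss equals a conditional entropy and the drop in loss from adding feature X2 to a predictor that already sees X1 should match the conditional mutual information I(Y; X2 | X1).

```python
import numpy as np

def bayes_log_loss(p_joint):
    """Expected log-loss (nats) of the Bayes-optimal predictor of Y.

    Assumes p_joint is a strictly positive pmf whose last axis indexes Y
    and whose leading axes index the features the predictor can see.
    This equals the conditional entropy H(Y | features).
    """
    p_features = p_joint.sum(axis=-1, keepdims=True)
    return float(-(p_joint * np.log(p_joint / p_features)).sum())

def conditional_mutual_information(p):
    """I(Y; X2 | X1) in nats for a strictly positive pmf p[x1, x2, y]."""
    p_x1 = p.sum(axis=(1, 2), keepdims=True)
    p_x1x2 = p.sum(axis=2, keepdims=True)
    p_x1y = p.sum(axis=1, keepdims=True)
    return float((p * np.log(p * p_x1 / (p_x1x2 * p_x1y))).sum())

# Hypothetical joint distribution over (X1, X2, Y); values are arbitrary.
rng = np.random.default_rng(0)
p = rng.random((2, 3, 2)) + 0.05
p /= p.sum()

loss_without_x2 = bayes_log_loss(p.sum(axis=1))  # predictor sees X1 only
loss_with_x2 = bayes_log_loss(p)                 # predictor sees X1 and X2
gain = loss_without_x2 - loss_with_x2
cmi = conditional_mutual_information(p)

print(f"log-loss reduction from adding X2: {gain:.6f} nats")
print(f"I(Y; X2 | X1):                     {cmi:.6f} nats")
```

The two printed quantities agree up to floating-point error. This is a population-level check on a known distribution; it says nothing about the finite-sample or protocol-dependent effects that the abstract treats as a separate layer of evidence.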

Version published to 10.21203/rs.3.rs-8712176/v1 on Research Square
Feb 20, 2026

Related articles

Manipulating Prior Causal Beliefs Through Causal Mechanism Information Affects the Outcome-Density Bias

This article has 2 authors:
1. Simon Stephan
2. Michael R. Waldmann
This article has no evaluations. Latest version Jan 8, 2026

WITHDRAWN: Masking Drives Shared Representations in Predictive Networks

This article has 1 author:
1. Shamim Khaliq
This article has no evaluations. Latest version Feb 19, 2026

Statistical Properties and Power Analysis of Divergence Measures for Credit Risk Model Monitoring

This article has 2 authors:
1. Abdullah Karasan
2. Alper Hekimoglu
This article has no evaluations. Latest version Feb 23, 2026
