Modeling Shortcut Deviation in Structured Representation Space for Reliable Neural Prediction

Abstract

Shortcut learning often induces representation shift in neural networks, undermining their generalization capabilities. To address this issue, we propose a novel framework, Feature Deviation in Structured Representation (FDSR), which explicitly models the deviation induced by shortcut functions within the structured representation space. By employing feature–function response mapping, FDSR isolates sub-representations influenced by spurious shortcut paths and accurately characterizes their interference with model predictions. We develop an information-theoretic metric, the Feature Deviation Score (FDS), and further introduce a Task-Specific Manifold Projection method to reconstruct task-aligned representation spaces. On bias-controlled benchmarks such as Waterbirds and HANS, empirical results demonstrate a strong negative correlation between FDS and prediction accuracy on textual tasks (Pearson's r = –0.74, p < 0.001), validating FDS as a reliable measure of shortcut-induced representation shift. In image classification tasks, models trained with the proposed FDSR framework achieve an average improvement of 13.2% ± 2.1% in out-of-distribution (OOD) accuracy. These findings provide both theoretical insights and practical tools for mitigating shortcut learning and enhancing the robustness of neural models.
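The abstract does not give the formula for the Feature Deviation Score, only that it is an information-theoretic measure of how far shortcut-influenced sub-representations pull features away from a task-aligned space. The sketch below is purely illustrative under that reading: the function names (task_aligned_projection, feature_deviation_score), the use of a between-class SVD subspace as the "task-aligned" space, and the symmetric KL divergence between a feature vector and its projection are all assumptions, not the authors' method.

import numpy as np


def task_aligned_projection(features, labels, n_components=3):
    # Hypothetical task-aligned subspace: top directions of the
    # between-class (class-mean) scatter. Illustrative only; not the
    # paper's Task-Specific Manifold Projection.
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    means -= means.mean(axis=0, keepdims=True)
    # Right singular vectors of the centered class means span the subspace.
    _, _, vt = np.linalg.svd(means, full_matrices=False)
    return vt[:n_components]  # (k, d) orthonormal rows


def feature_deviation_score(features, basis, eps=1e-8):
    # Hypothetical per-sample deviation: symmetric KL divergence between a
    # softmax-normalized feature vector and its reconstruction from the
    # task-aligned basis. One plausible "information-theoretic" reading.
    proj = features @ basis.T @ basis  # reconstruct within the subspace

    def softmax(x):
        x = x - x.max(axis=1, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=1, keepdims=True)

    p, q = softmax(features), softmax(proj)
    kl_pq = np.sum(p * np.log((p + eps) / (q + eps)), axis=1)
    kl_qp = np.sum(q * np.log((q + eps) / (p + eps)), axis=1)
    return 0.5 * (kl_pq + kl_qp)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 64))   # stand-in penultimate-layer features
    labels = rng.integers(0, 5, size=200)
    basis = task_aligned_projection(feats, labels, n_components=3)
    fds = feature_deviation_score(feats, basis)
    print("mean deviation score:", float(fds.mean()))

Correlating such a per-sample score with per-sample correctness (for example with scipy.stats.pearsonr) would mirror the FDS-versus-accuracy analysis reported in the abstract, though the paper's actual estimator may differ.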
