Scaffolded representation learning in deep networks
Abstract
Deep networks learn coarse structure before fine-grained distinctions, yet whether coarse structure actively scaffolds later differentiation remains untested. Here we show that representations assemble through a load-bearing scaffold. Tracking features at per-sample resolution across 55 runs, three architecture families, and two training datasets, we find a reproducible three-phase program: task-general features emerge and dominate first, superclass groupings form next, and class-level distinctions develop last. Selectively corrupting superclass boundaries impairs later differentiation, suggesting that fine-grained learning depends on the coherence of coarser representations. Conversely, a curriculum that pre-builds the scaffold reduces differentiation cost 6.7-fold while nearly preserving accuracy and halving overfitting. These findings connect critical learning periods, neural collapse, progressive differentiation, the lottery ticket hypothesis, and catastrophic forgetting within a single developmental account, and yield training diagnostics relevant to curriculum design, transfer timing, and mechanistic interpretability.
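The curriculum intervention can be pictured as a two-phase training loop: optimize against superclass targets first, then switch the loss to fine-grained class targets. The sketch below is a minimal PyTorch illustration, not the authors' implementation; the two-head design, the `fine_to_super` mapping, the `feature_dim` attribute, and all hyperparameters are assumptions for a CIFAR-100-style label hierarchy.

```python
# Minimal coarse-to-fine curriculum sketch, assuming a dataset whose
# fine labels map onto superclasses (e.g. a CIFAR-100-style hierarchy).
# All names and hyperparameters here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def curriculum_train(model, loader, fine_to_super, coarse_epochs,
                     fine_epochs, num_super, num_fine, device="cpu"):
    """Phase 1 trains against superclass targets (pre-building the
    scaffold); phase 2 switches to fine-grained class targets."""
    # Two linear heads over one shared backbone, one per label granularity.
    # Assumes `model` exposes a `feature_dim` attribute and returns features.
    coarse_head = nn.Linear(model.feature_dim, num_super).to(device)
    fine_head = nn.Linear(model.feature_dim, num_fine).to(device)
    mapping = torch.as_tensor(fine_to_super, device=device)  # fine -> super

    opt = torch.optim.SGD(
        list(model.parameters()) + list(coarse_head.parameters())
        + list(fine_head.parameters()),
        lr=0.1, momentum=0.9)

    for epoch in range(coarse_epochs + fine_epochs):
        coarse_phase = epoch < coarse_epochs
        for x, y_fine in loader:
            x, y_fine = x.to(device), y_fine.to(device)
            feats = model(x)  # shared backbone features
            if coarse_phase:
                # Superclass loss: derive coarse targets from fine labels.
                loss = F.cross_entropy(coarse_head(feats), mapping[y_fine])
            else:
                # Fine-grained loss on the original class labels.
                loss = F.cross_entropy(fine_head(feats), y_fine)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

The design choice to keep separate heads while sharing the backbone mirrors the abstract's claim: the coarse phase shapes backbone representations (the scaffold) that the fine phase then differentiates, rather than the coarse task being an end in itself.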