Symmetry Breaking in Neural Network Optimization: Insights from Input Dimension Expansion
Abstract
Understanding how neural networks learn and optimize remains a central question in machine learning, with implications for designing better models. While techniques like dropout and batch normalization are widely used, the underlying principles driving their success, such as symmetry breaking (a concept borrowed from physics), remain underexplored. We propose the symmetry breaking hypothesis, showing that breaking symmetries during training (e.g., via input expansion) substantially improves performance across tasks. We develop a metric to quantify symmetry breaking in networks, revealing its role in common optimization methods and its connection to properties like equivariance. This metric offers a practical tool to evaluate architectures without exhaustive training or full datasets, enabling more efficient design choices. Our work positions symmetry breaking as a unifying principle behind optimization techniques, bridging theoretical gaps and providing actionable insights for improving model efficiency.
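The abstract does not specify how input dimension expansion is implemented; the sketch below is one plausible, illustrative instantiation (not the paper's method), in which each input is padded with a small set of extra learned feature dimensions before reaching the base network. The class name `InputExpansion` and the choice of a learned pad vector are assumptions made for illustration only.

```python
import torch
import torch.nn as nn

class InputExpansion(nn.Module):
    """Illustrative input-dimension expansion: append extra feature
    dimensions to each input before the base network sees it.
    The exact expansion scheme is not given in the abstract; a small
    learned pad vector is one plausible (hypothetical) choice."""

    def __init__(self, in_dim: int, extra_dim: int):
        super().__init__()
        self.extra_dim = extra_dim
        # Learned values for the appended dimensions (hypothetical choice;
        # zeros or random noise would be alternative instantiations).
        self.extra = nn.Parameter(torch.randn(extra_dim) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, in_dim); output has shape (batch, in_dim + extra_dim).
        batch = x.shape[0]
        pad = self.extra.expand(batch, self.extra_dim)
        return torch.cat([x, pad], dim=1)

# Usage: wrap a base MLP so it operates on the expanded input.
expand = InputExpansion(in_dim=784, extra_dim=16)
model = nn.Sequential(
    expand,
    nn.Linear(784 + 16, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
```

The intuition, under this reading, is that the extra dimensions perturb otherwise interchangeable units and weights, breaking symmetries in the loss landscape that the optimizer would otherwise have to resolve on its own.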