Discrete Weight Neural Networks: Investigating the Relationship Between Weight Precision and Generalization

Abstract

Biological synapses transmit signals discretely and with noise, yet biological learners often generalize from few examples. Motivated by this contrast, we study how constraining neural-network weights to discrete grids affects fit and generalization in a controlled rule-learning task. We compare standard float32 training with binary, ternary, and small-integer weight constraints using straight-through estimators (STE), and we include a fine-grained fixed-point grid (Q16.16). We also evaluate a simple pure-integer coordinate-descent baseline to isolate optimization effects when updates are restricted to integer steps. On a 5×5 relational sum-comparison task, coarse discretization substantially reduces the train–test gap but also degrades attainable accuracy under our optimization procedures, indicating that reduced overfitting often coincides with underfitting. In contrast, Q16.16 fixed-point training preserves learnability and, in some settings, matches or exceeds our float32 baseline (e.g., 84% vs. 78% test accuracy at n = 500 in one configuration). We discuss these results in the context of prior work on quantization as regularization and on integer-only training, and we highlight optimization—rather than representational capacity—as the primary bottleneck for very low-bit weights in this setting.
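As a rough illustration of the two weight-constraint schemes mentioned above, the sketch below shows straight-through-estimator (STE) forward rounding onto a ternary grid and onto the Q16.16 fixed-point grid, written in PyTorch. The function names, the ternary threshold value, and the toy usage at the bottom are assumptions for illustration only, not the paper's implementation.

```python
import torch

def ternary_ste(w: torch.Tensor, threshold: float = 0.05) -> torch.Tensor:
    # Forward pass: map each weight to {-1, 0, +1} (zero where |w| <= threshold).
    # Backward pass: identity gradient via the straight-through trick below.
    w_q = torch.sign(w) * (w.abs() > threshold).float()
    return w + (w_q - w).detach()

def q16_16_ste(w: torch.Tensor) -> torch.Tensor:
    # Forward pass: snap to the Q16.16 fixed-point grid (step size 2**-16),
    # clamped to the representable range. Backward pass: identity gradient.
    scale = 2.0 ** 16
    w_q = torch.round(w * scale) / scale
    w_q = torch.clamp(w_q, min=-(2.0 ** 15), max=2.0 ** 15 - 2.0 ** -16)
    return w + (w_q - w).detach()

if __name__ == "__main__":
    # Hypothetical toy check: quantized weights in the forward pass,
    # full-precision gradients flowing back through the STE.
    torch.manual_seed(0)
    w = torch.randn(4, 3, requires_grad=True)
    x = torch.randn(2, 3)
    y = x @ ternary_ste(w).t()
    y.sum().backward()
    print(w.grad.shape)  # torch.Size([4, 3])
```

The `w + (w_q - w).detach()` expression is the standard way to express an STE in autograd frameworks: the forward value equals the quantized weight, while the detached residual makes the backward pass treat the rounding as the identity function.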