An Information-Theoretic Approach to Optimal Training Set Construction for Neural Networks
Abstract
We present cEntMax, an information-theoretic framework for training-set optimization that selects class-wise informative samples by their cross-entropy divergence to prototype pivots. Under a noisy-channel generative view and a local-linearity assumption for deep networks, the method connects predictive entropy, Fisher information, and G-optimal coverage. Experiments on EMNIST and KMNIST show faster convergence, lower validation loss, and greater stability than random sampling, especially at moderate sampling fractions.
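To make the selection rule concrete, the following is a minimal sketch, not the authors' implementation: it assumes each class's prototype pivot is the mean predicted distribution over that class's samples, scores candidates by cross-entropy divergence to that pivot, and keeps the highest-scoring fraction per class (consistent with the "Max" in cEntMax). The function names, the pivot definition, and the per-class selection scheme are all illustrative assumptions.

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) between two discrete distributions."""
    return -np.sum(p * np.log(q + eps))

def select_per_class(probs, labels, frac=0.1):
    """Score each sample by cross-entropy divergence to its class
    prototype pivot (assumed here: the class's mean predicted
    distribution) and keep the top `frac` of samples per class."""
    selected = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        pivot = probs[idx].mean(axis=0)  # assumed prototype pivot
        scores = np.array([cross_entropy(probs[i], pivot) for i in idx])
        k = max(1, int(frac * len(idx)))
        # Most divergent samples are treated as most informative.
        selected.extend(idx[np.argsort(scores)[-k:]])
    return np.array(sorted(selected))

# Toy usage: random softmax outputs over 10 classes for 1000 samples.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 10, size=1000)
subset = select_per_class(probs, labels, frac=0.1)
print(subset.shape)  # roughly 10% of the samples, balanced per class
```

The per-class loop keeps the selected subset class-balanced, matching the abstract's emphasis on class-wise informativeness rather than a single global ranking.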