A Novel Differential Loss Function for Enhancing Generalization in Machine Learning Models

Abstract

Generalization remains one of the central challenges in machine learning: models often overfit the training data and perform poorly on unseen samples. We propose a new loss function, Differential Generalization Loss (DGL), which improves generalization by adding a differential term that penalizes steep changes in the test error during training. DGL extends classical loss functions such as Cross-Entropy with this dynamic regularization term, which stabilizes training, improves the robustness of the resulting model, and shrinks the generalization gap. We evaluated DGL extensively on the benchmark datasets MNIST and CIFAR-10, as well as the more challenging Fashion-MNIST, using several neural network architectures, including a fully connected network and ResNet-18. Experimental results show that DGL outperforms standard Cross-Entropy loss, as well as Cross-Entropy combined with commonly used regularization techniques (L2 weight decay, Dropout), by a substantial margin in average test accuracy and generalization gap, while also producing visibly smoother test-error curves. These improvements are statistically confirmed across multiple experimental runs using t-tests. These findings establish DGL as a strong and broadly applicable form of dynamic regularization and a new tool for improving model generalization across a wide range of machine learning areas. This work is a step toward loss functions that incorporate the dynamics of the test error, offers a new perspective on loss design, and lays a solid foundation for future work exploring and applying it.
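The abstract does not give the exact form of the differential term, but the core idea, a base loss plus a penalty on steep changes in a tracked error signal, can be sketched. Below is a minimal PyTorch sketch assuming an exponential moving average of past loss values as the reference signal; the class name `DifferentialGeneralizationLoss`, the hyperparameters `lam` and `ema_decay`, and the use of the current batch loss as a differentiable stand-in for the test error are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of one plausible reading of DGL: cross-entropy plus
# a "differential" penalty on the change of the loss relative to a
# running estimate. The EMA form, `lam`, and `ema_decay` are assumed;
# the paper's actual formulation may differ.
import torch
import torch.nn as nn

class DifferentialGeneralizationLoss(nn.Module):
    def __init__(self, lam: float = 0.1, ema_decay: float = 0.9):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        self.lam = lam              # weight of the differential term (assumed)
        self.ema_decay = ema_decay  # smoothing of the running error estimate (assumed)
        self.register_buffer("ema_loss", torch.tensor(float("nan")))

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        base = self.ce(logits, targets)
        if torch.isnan(self.ema_loss):
            # First call: initialize the running estimate.
            self.ema_loss = base.detach()
        # Penalize steep deviations from the running estimate; the EMA is
        # detached, so gradients flow only through the current loss.
        diff_penalty = (base - self.ema_loss) ** 2
        self.ema_loss = (self.ema_decay * self.ema_loss
                         + (1 - self.ema_decay) * base.detach())
        return base + self.lam * diff_penalty
```

In a training loop this would be used like any ordinary criterion: `criterion = DifferentialGeneralizationLoss(lam=0.1)`, then `loss = criterion(model(x), y)` and `loss.backward()`. Detaching the moving average keeps the penalty differentiable with respect to the current parameters only, which is one simple way to make a "change in error" term trainable.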
