A Novel Differential Loss Function for Enhancing Generalization in Machine Learning Models

Abstract

Generalization remains one of the central challenges in machine learning: models often overfit the training data and perform poorly on unseen samples. We propose a new loss function, Differential Generalization Loss (DGL), which improves generalization by adding a differential term that penalizes steep changes in the test error during training. DGL extends classical loss functions such as Cross-Entropy with this dynamic regularization term, which stabilizes training, improves the robustness of the resulting model, and shrinks the generalization gap. We evaluated DGL extensively on the benchmark datasets MNIST and CIFAR-10, as well as the more challenging Fashion-MNIST, using several neural network architectures, including a fully connected network and ResNet-18. Experimental results show that DGL outperforms standard Cross-Entropy loss, as well as Cross-Entropy combined with commonly used regularization techniques (L2 weight decay, Dropout), by a substantial margin in average test accuracy and generalization gap, while also producing visibly smoother test-error curves. These improvements are statistically confirmed across multiple experimental runs using t-tests. These findings establish DGL as a strong and broadly applicable form of dynamic regularization and a new tool for improving model generalization across a wide range of machine learning areas. This work is a step toward loss functions that incorporate the dynamics of the test error, offers a new perspective on loss design, and lays a solid foundation for future work exploring and applying it.
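The abstract does not give the exact form of the differential term, but the core idea, a base loss plus a penalty on steep changes in a tracked error signal, can be sketched. Below is a minimal PyTorch sketch assuming an exponential moving average of past loss values as the reference signal; the class name `DifferentialGeneralizationLoss`, the hyperparameters `lam` and `ema_decay`, and the use of the current batch loss as a differentiable stand-in for the test error are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of one plausible reading of DGL: cross-entropy plus
# a "differential" penalty on the change of the loss relative to a
# running estimate. The EMA form, `lam`, and `ema_decay` are assumed;
# the paper's actual formulation may differ.
import torch
import torch.nn as nn

class DifferentialGeneralizationLoss(nn.Module):
    def __init__(self, lam: float = 0.1, ema_decay: float = 0.9):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        self.lam = lam              # weight of the differential term (assumed)
        self.ema_decay = ema_decay  # smoothing of the running error estimate (assumed)
        self.register_buffer("ema_loss", torch.tensor(float("nan")))

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        base = self.ce(logits, targets)
        if torch.isnan(self.ema_loss):
            # First call: initialize the running estimate.
            self.ema_loss = base.detach()
        # Penalize steep deviations from the running estimate; the EMA is
        # detached, so gradients flow only through the current loss.
        diff_penalty = (base - self.ema_loss) ** 2
        self.ema_loss = (self.ema_decay * self.ema_loss
                         + (1 - self.ema_decay) * base.detach())
        return base + self.lam * diff_penalty
```

In a training loop this would be used like any ordinary criterion: `criterion = DifferentialGeneralizationLoss(lam=0.1)`, then `loss = criterion(model(x), y)` and `loss.backward()`. Detaching the moving average keeps the penalty differentiable with respect to the current parameters only, which is one simple way to make a "change in error" term trainable.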
