A Globally Optimal Alternative to MLP



Abstract

In deep learning, reaching the global minimum is a significant challenge, even for relatively simple architectures such as Multi-Layer Perceptrons (MLPs). To address this challenge, we visualized model states at both local and global optima, identifying the factors that prevent models trained with conventional methods from moving from local to global minima. Based on these insights, we propose the Lagrange Regressor (LReg), a framework that is mathematically equivalent to MLPs. Rather than updating parameters via iterative optimization, LReg employs a discrete mesh-refinement-coarsening process that drives the model’s loss to the global minimum. LReg converges faster and overcomes the inherent difficulty neural networks have in fitting multi-frequency functions. Experiments on large-scale benchmarks including ImageNet-1K (image classification), GLUE (natural language understanding), and WikiText (language modeling) show that LReg consistently enhances the performance of pre-trained models, significantly lowers test loss, and scales effectively to big-data scenarios. These results underscore LReg’s potential as a scalable, optimization-free alternative for deep learning on large and complex datasets, aligning with the goals of big data analytics.
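The abstract does not spell out the mechanics of the mesh-refinement-coarsening loop, so the following is only a hypothetical one-dimensional illustration of the general idea, not the authors' actual LReg: fit the nodal values of a piecewise-linear (degree-1 Lagrange) basis by least squares, bisect cells with large residuals, and delete interior nodes whose removal leaves the fit unchanged. All function names, tolerances, and the toy multi-frequency target below are our own assumptions.

```python
import numpy as np

def hat_design(mesh, x):
    # Degree-1 Lagrange ("hat") basis functions on a sorted 1-D mesh:
    # each sample x falls in one cell and gets weights (1-t, t) on its endpoints.
    idx = np.clip(np.searchsorted(mesh, x) - 1, 0, len(mesh) - 2)
    t = (x - mesh[idx]) / (mesh[idx + 1] - mesh[idx])
    A = np.zeros((len(x), len(mesh)))
    rows = np.arange(len(x))
    A[rows, idx] = 1.0 - t
    A[rows, idx + 1] = t
    return A, idx

def fit(mesh, x, y):
    # On a *fixed* mesh the nodal values solve a convex linear least-squares
    # problem, so this subproblem is globally optimal -- no gradient descent.
    A, idx = hat_design(mesh, x)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, A, idx

def lreg_sketch(x, y, n_rounds=10, refine_tol=1e-2, coarsen_tol=1e-4):
    mesh = np.array([x.min(), x.max()])
    for _ in range(n_rounds):
        coef, A, idx = fit(mesh, x, y)
        resid = np.abs(A @ coef - y)
        # Refine: bisect every cell whose worst residual exceeds the tolerance.
        cell_err = np.zeros(len(mesh) - 1)
        np.maximum.at(cell_err, idx, resid)
        mids = 0.5 * (mesh[:-1] + mesh[1:])
        new_nodes = mids[cell_err > refine_tol]
        if new_nodes.size == 0:
            break
        mesh = np.sort(np.concatenate([mesh, new_nodes]))
        # Coarsen: drop interior nodes whose value is already the linear
        # interpolant of their neighbours, so removing them changes little.
        coef, _, _ = fit(mesh, x, y)
        w = (mesh[1:-1] - mesh[:-2]) / (mesh[2:] - mesh[:-2])
        lin = (1.0 - w) * coef[:-2] + w * coef[2:]
        keep = np.abs(coef[1:-1] - lin) > coarsen_tol
        mesh = np.concatenate([mesh[:1], mesh[1:-1][keep], mesh[-1:]])
    coef, _, _ = fit(mesh, x, y)
    return mesh, coef

# Toy multi-frequency target, the kind gradient-trained MLPs fit slowly
# because of spectral bias.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 2000))
y = np.sin(2 * np.pi * x) + 0.3 * np.sin(40 * np.pi * x)
mesh, coef = lreg_sketch(x, y)
err = np.abs(np.interp(x, mesh, coef) - y).max()
print(f"{len(mesh)} mesh nodes, max |error| = {err:.3f}")
```

Note the division of labour in this sketch: the continuous part (nodal values) is solved exactly at every step, while all adaptation happens through the discrete mesh edits, which is consistent with the abstract's description of a discrete process replacing iterative optimization.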
