A Unified Framework for Non-Convex Optimization in Deep Learning via Adaptive Variance Reduction
Abstract
The increasing complexity of deep learning models demands optimization techniques that can effectively navigate non-convex loss landscapes. Traditional methods such as stochastic gradient descent (SGD) suffer from high variance in their gradient estimates, which slows convergence and degrades final performance. This paper proposes a unified framework for adaptive variance reduction in stochastic non-convex optimization that addresses these issues. We introduce the Adaptive Variance Reduced Gradient (AVRG) algorithm, which dynamically balances the degree of variance reduction against computational cost, yielding improved convergence rates and greater robustness. Our framework synthesizes existing adaptive variance reduction methods into a cohesive theoretical treatment, filling gaps in the current literature, and we establish convergence guarantees for the proposed approach. Comprehensive empirical evaluations on diverse deep learning benchmarks show that AVRG converges faster and achieves better final performance than established baselines. By equipping researchers and practitioners with stronger optimization strategies, this work supports more effective training of deep neural networks across a wide range of applications and opens directions for future research on more complex optimization scenarios.
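To make the variance-reduction idea concrete, the sketch below shows a generic SVRG-style update in which the stochastic gradient is corrected by a control variate computed at a periodic snapshot, and a heuristic weight blends the plain and variance-reduced estimates. The toy least-squares objective, the snapshot schedule, and the `alpha` adaptation rule are illustrative assumptions only; they are not the AVRG algorithm defined in the paper.

```python
# Minimal sketch of an SVRG-style variance-reduced step with a heuristic
# adaptive mixing weight. The toy least-squares problem, the per-epoch
# snapshot schedule, and the `alpha` update are illustrative assumptions,
# not the paper's AVRG algorithm.
import numpy as np

rng = np.random.default_rng(0)
n, d = 512, 20
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    """Per-example gradient of the squared loss 0.5 * (x_i^T w - y_i)^2."""
    return (X[i] @ w - y[i]) * X[i]

def full_grad(w):
    """Full-batch gradient, used to build the control variate."""
    return X.T @ (X @ w - y) / n

w = np.zeros(d)
lr = 0.01
for epoch in range(20):
    w_snap = w.copy()        # snapshot point for the control variate
    mu = full_grad(w_snap)   # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        g_sgd = grad_i(w, i)
        g_vr = g_sgd - grad_i(w_snap, i) + mu   # SVRG-style estimate
        # Heuristic adaptation: trust the variance-reduced estimate more
        # while the iterate stays near the snapshot, less as it drifts.
        alpha = 1.0 / (1.0 + np.linalg.norm(w - w_snap))
        w -= lr * (alpha * g_vr + (1.0 - alpha) * g_sgd)
    print(f"epoch {epoch:2d}  loss {0.5 * np.mean((X @ w - y) ** 2):.6f}")
```

The key design point illustrated here is the trade-off the abstract refers to: the variance-reduced estimate is unbiased and has low variance near the snapshot, but it requires periodic full-gradient passes, so an adaptive scheme must weigh its benefit against that extra computation.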