The Module Gradient Descent Algorithm via L₂ Regularization for Wavelet Neural Networks

Abstract

Wavelet neural networks (WNNs) combine the expressive power of neural models with multiscale localization, yet theoretical guarantees for their training remain scarce. We study the optimization dynamics of gradient descent (GD) with L2 regularization (weight decay) for WNNs. First, in the fixed-feature regime, where the wavelet atoms are frozen and only the linear head is trained, we prove global linear convergence to the unique ridge solution, with explicit rates governed by the spectrum of the regularized Gram matrix. Second, for fully trainable WNNs, we establish convergence of GD to stationary points under standard smoothness and boundedness assumptions on the wavelet parameters, and we prove linear rates in regions where a Polyak–Łojasiewicz inequality holds; weight decay enlarges these regions by suppressing flat directions. Third, we characterize the implicit bias in the over-parameterized (NTK) regime: GD with L2 regularization converges to the minimum-RKHS-norm interpolant associated with the WNN kernel. We supplement the theory with experiments on synthetic regression and denoising, ablations over λ and the stepsize, and practical recommendations on initialization, stepsize schedules, and regularization scales. Together, our results offer a principled recipe for reliable WNN training with broad relevance to signal processing applications, and they clarify when and why L2-regularized GD is stable and fast for WNNs.
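
The fixed-feature regime described above reduces to ridge regression on a frozen wavelet dictionary, so its convergence claim can be illustrated in a few lines. The sketch below is an assumption-laden illustration rather than the authors' code: the Morlet-style atoms, the synthetic data, and the choices of λ and stepsize are all hypothetical. It runs gradient descent with weight decay on the linear head and compares the iterate to the closed-form ridge solution, reading the safe stepsize and contraction factor off the spectrum of the regularized Gram matrix.

```python
# Minimal sketch (not the paper's code) of the fixed-feature regime:
# wavelet atoms are frozen, only the linear head is trained with
# gradient descent plus L2 regularization (weight decay).
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (assumed; not from the paper).
n = 200
x = np.linspace(-1.0, 1.0, n)
y = np.sin(4 * np.pi * x) * np.exp(-x**2) + 0.05 * rng.standard_normal(n)

# Fixed dictionary of wavelet-like atoms psi((x - b) / a) with random scales/shifts.
m = 64
scales = rng.uniform(0.05, 0.5, m)
shifts = rng.uniform(-1.0, 1.0, m)

def morlet(t):
    """Real Morlet-style mother wavelet (one common choice; an assumption here)."""
    return np.cos(5.0 * t) * np.exp(-0.5 * t**2)

Phi = morlet((x[:, None] - shifts[None, :]) / scales[None, :])  # n x m feature matrix

lam = 1e-2                                   # L2 (weight decay) strength lambda
G = Phi.T @ Phi / n + lam * np.eye(m)        # regularized Gram matrix
w_ridge = np.linalg.solve(G, Phi.T @ y / n)  # unique ridge solution

# Gradient descent on the regularized least-squares loss.
# Linear convergence is governed by the spectrum of G: a safe stepsize is
# 1 / lambda_max(G), and the per-step contraction is 1 - lambda_min(G) / lambda_max(G).
eigs = np.linalg.eigvalsh(G)   # ascending eigenvalues
eta = 1.0 / eigs[-1]
w = np.zeros(m)
for _ in range(2000):
    grad = Phi.T @ (Phi @ w - y) / n + lam * w
    w -= eta * grad

print("distance to ridge solution:", np.linalg.norm(w - w_ridge))
print("predicted contraction per step:", 1 - eigs[0] / eigs[-1])
```

Increasing λ raises lambda_min(G) and therefore shrinks the contraction factor, which is one way to see why weight decay speeds up and stabilizes GD in this regime.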
