Denoising Neural Models with Spectral QUEnching and Eigenvalue ZEroing (SQUEEZE)


Abstract

\Glspl{llm} built on the Transformer architecture perform very well but require substantial computational power. This paper presents \gls{squeeze}, a new framework that treats model compression as a signal-to-noise separation task. Using principles from \gls{rmt}, \gls{squeeze} identifies and preserves structural signal within weight matrices while discarding components consistent with random noise. Unlike traditional methods, this approach is a post-training transformation that requires no retraining of the model. Our analysis of fine-tuned \texttt{BERT-base} models reveals that matrix aspect ratio ($\beta$) strongly influences spectral behavior: rectangular feed-forward (\gls{ffn}) layers ($\beta = 0.25$) adhere closely to the \gls{mp} law and exhibit substantial redundancy, whereas square attention matrices ($\beta = 1$) and highly rectangular embedding matrices ($\beta \ll 1$) depart significantly from the null model. We implement a five-step pipeline -- standardization, \gls{svd} decomposition, baseline establishment, \gls{tw} finite-size adjustment, and rank truncation. Across three \gls{glue} tasks, \gls{squeeze} reduces model size by \SI{8.1}{\percent} with an accuracy loss of less than \SI{2.5}{\percent}. In particular, targeting \gls{ffn} layers allows a \SI{64}{\percent} reduction in effective rank while maintaining a cosine similarity above 0.80. These results show that spectral geometry is a critical factor in Transformer compressibility, positioning \gls{squeeze} as a high-fidelity alternative to aggressive methods such as quantization when model quality is the primary priority.
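The five-step pipeline named in the abstract can be sketched for a single weight matrix. This is a minimal illustration, not the paper's implementation: the function name, the `tw_margin` parameter, and the particular form of the Tracy--Widom-style finite-size adjustment (a tunable $O(N^{-2/3})$ widening of the Marchenko--Pastur bulk edge) are assumptions made for the example.

```python
import numpy as np


def squeeze_truncate(W, tw_margin=1.0):
    """Hypothetical sketch of the five SQUEEZE steps on one weight matrix.

    Steps: (1) standardization, (2) SVD, (3) Marchenko-Pastur noise
    baseline, (4) a simple Tracy-Widom-style finite-size adjustment,
    and (5) rank truncation. `tw_margin` is an illustrative knob, not
    a parameter from the paper.
    """
    n, p = W.shape
    beta = min(n, p) / max(n, p)  # aspect ratio in (0, 1]
    mu, sigma = W.mean(), W.std()

    # 1. Standardize entries to zero mean, unit variance.
    Z = (W - mu) / sigma

    # 2. SVD decomposition.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)

    # 3. MP baseline: for an iid unit-variance matrix, eigenvalues of
    #    Z^T Z / max(n, p) stay below (1 + sqrt(beta))^2, so pure-noise
    #    singular values stay below sqrt(max(n, p)) * (1 + sqrt(beta)).
    mp_edge = np.sqrt(max(n, p)) * (1.0 + np.sqrt(beta))

    # 4. TW finite-size adjustment: widen the bulk edge by an
    #    O(N^{-2/3})-scale margin so edge fluctuations are not
    #    mistaken for signal.
    edge = mp_edge * (1.0 + tw_margin * max(n, p) ** (-2.0 / 3.0))

    # 5. Rank truncation: keep only the above-edge ("signal") components.
    k = max(1, int(np.sum(s > edge)))
    Z_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]

    # Undo the standardization and report the retained rank.
    return Z_hat * sigma + mu, k
```

A planted-signal sanity check makes the behavior concrete: adding a strong rank-2 component to Gaussian noise should leave exactly two singular values above the adjusted edge, so the truncation recovers rank 2.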
