Fisher-Aware Adaptive Mixed-Precision Ternary Hybrid Quantization
Abstract
Deploying deep neural networks on resource-limited platforms is difficult because of their reliance on computationally and memory-intensive 32-bit floating-point operations. Although binary and ternary quantization approaches offer substantial memory reduction, applying the same precision to all layers degrades model performance, since individual layers differ in their sensitivity to quantization. To address these concerns, the Fisher-Aware Ternary Hybrid (FATH) quantization technique is presented. It builds upon binary neural networks, introducing ternary quantization while allowing a different precision level for each layer. Using layer-wise Fisher Information estimates as a proxy for the Hessian trace, FATH assigns each layer an appropriate precision: 16-bit float, 4-bit float, or ternary quantization. In the ternary representation, weights are normalized using an absolute-mean scaling factor and a threshold, and only the values −1, 0, and +1 are allowed, which filters out less important features. To ensure training stability in the presence of discontinuous parameters, quantization-aware training is implemented with the straight-through estimator. In addition to weight quantization, activations are quantized to 8 bits. The approach was evaluated on a subset of the Food-101 dataset and largely preserved model performance despite a significant reduction in model size.
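The abstract does not give implementation details, so the following PyTorch sketch only illustrates the kind of machinery it describes: ternarization with an absolute-mean scale and threshold, a straight-through estimator for quantization-aware training, and a layer-wise squared-gradient (diagonal Fisher) score for ranking layer sensitivity. All function and class names are hypothetical, and the 0.7 threshold constant follows the common ternary-weight-network heuristic rather than anything stated in the paper.

```python
import torch
import torch.nn.functional as F

def ternary_quantize(w: torch.Tensor, delta_factor: float = 0.7) -> torch.Tensor:
    """Map a weight tensor to {-alpha, 0, +alpha} (hypothetical sketch).

    The threshold is proportional to the mean absolute weight, and the
    scaling factor alpha is the mean magnitude of the surviving weights.
    """
    delta = delta_factor * w.abs().mean()                 # absolute-mean threshold
    mask = (w.abs() > delta).float()                      # keep only salient weights
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)
    ternary = torch.sign(w) * mask                        # values in {-1, 0, +1}
    return alpha * ternary

class TernaryLinearSTE(torch.nn.Linear):
    """Linear layer for quantization-aware training: the forward pass uses
    ternarized weights, while the straight-through estimator passes gradients
    unchanged to the latent full-precision weights."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q = ternary_quantize(self.weight)
        # STE: forward value equals w_q, backward gradient flows to self.weight.
        w_ste = self.weight + (w_q - self.weight).detach()
        return F.linear(x, w_ste, self.bias)

def fisher_sensitivity(model: torch.nn.Module, loss: torch.Tensor) -> dict:
    """Per-parameter mean squared gradient as a diagonal Fisher estimate,
    a common proxy for the Hessian trace when ranking layers by
    quantization sensitivity (hypothetical helper)."""
    params = [(n, p) for n, p in model.named_parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, [p for _, p in params], retain_graph=True)
    return {n: g.pow(2).mean().item() for (n, _), g in zip(params, grads)}
```

In a mixed-precision scheme like the one described, the sensitivity scores returned by `fisher_sensitivity` would be sorted and the most sensitive layers kept at 16-bit or 4-bit float while the least sensitive ones are ternarized; the exact assignment rule is not specified in the abstract.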