Low-Rank Optimization for Efficient Compression of CNN Models

Abstract

Tensor decomposition is an important method for compressing convolutional neural network (CNN) models. However, the decomposition requires an appropriate rank parameter to be configured for each convolutional kernel tensor. To address the difficulty of setting these ranks, we propose a low-rank optimization algorithm based on information entropy. By solving the resulting optimization problems, the algorithm automatically learns the low-rank structure and rank parameters of the convolutional kernel tensors, achieving global automatic rank configuration while preserving model accuracy. In addition, we design a weight generator for the decomposed network that dynamically assesses, on a global scale, the importance of the filters in the low-dimensional convolutional kernel tensors. Pruning in this low-dimensional space further improves compression with minimal loss in accuracy. Experiments with various CNN models on different datasets show that the proposed low-rank optimization algorithm obtains all rank parameters in a single training process, and that the average accuracy loss of the decomposed models does not exceed 1%. Meanwhile, the pruning method in low-dimensional space achieves a compression ratio of over 4.7× with an accuracy loss of less than 1.3%.
