ZOE: Zero Overhead ECC Techniques for Flash Memory Used in AI Accelerators
Abstract
One important role of flash memory is to store the trained weights of state-of-the-art deep neural networks (DNNs). However, flash memory suffers from many reliability and endurance issues. Consequently, erroneous weights can degrade classification accuracy, which is unacceptable for mission-critical applications. To address these challenges, a highly efficient zero-overhead Error Correction Code (ECC) technique named ZOE is proposed in this paper. By exploiting redundancy in weight representation, weights are partitioned into reducible weights (RWs) and irreducible weights (IRWs). Reducible weights can be represented with a shorter weight length, and the saved bits can be used to store the check bits of the adopted ECC. For most DNN models, the weight values are mostly close to zero; therefore, the proportion of RWs is usually very high, allowing many bits to be saved for ECC. Moreover, since a codeword of flash memory typically consists of many weights, the proportion of RWs may vary across codewords. This variation can compromise the reduction efficiency of ZOE. Therefore, we also propose a weight leveling technique that evenly distributes RWs across all codewords, along with an algorithm for deriving the control words that orchestrate the leveling. The corresponding hardware architectures of ZOE are then developed. Experimental results demonstrate that the reliability and accuracy of typical DNN models can be significantly improved with negligible hardware overhead.
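The sketch below is a minimal illustration of the RW/IRW partitioning idea described in the abstract, not the paper's actual scheme. It assumes 8-bit signed weights and a hypothetical 5-bit reduced representation, so each reducible weight frees 3 bits that could hold ECC check bits for its codeword; the concrete weight formats, thresholds, ECC, and weight leveling used by ZOE are defined in the paper itself.

```python
# Minimal sketch of RW/IRW partitioning (assumed parameters, not ZOE's).
WEIGHT_BITS = 8
REDUCED_BITS = 5                      # hypothetical shorter representation
SAVED_BITS = WEIGHT_BITS - REDUCED_BITS

def is_reducible(w: int) -> bool:
    """True if the signed weight fits in REDUCED_BITS two's-complement bits."""
    lo, hi = -(1 << (REDUCED_BITS - 1)), (1 << (REDUCED_BITS - 1)) - 1
    return lo <= w <= hi

def partition(weights):
    """Split a codeword's weights into reducible (RW) and irreducible (IRW) sets."""
    rws = [w for w in weights if is_reducible(w)]
    irws = [w for w in weights if not is_reducible(w)]
    return rws, irws

def saved_check_bits(weights) -> int:
    """Bits freed by the RWs in this codeword, available for ECC check bits."""
    rws, _ = partition(weights)
    return len(rws) * SAVED_BITS

# Example: near-zero weights dominate, so most weights are reducible.
codeword = [0, -1, 3, 2, 120, -5, 0, -90]
rws, irws = partition(codeword)
print(len(rws), len(irws), saved_check_bits(codeword))  # -> 6 2 18
```

In this toy example, 6 of the 8 weights are reducible, freeing 18 bits in the codeword for check bits. Because real codewords may differ widely in their RW counts, the paper's weight leveling step rebalances RWs across codewords so that each codeword has enough saved bits for its ECC.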