UNIST - Graduate School of Artificial Intelligence

Research Achievements

Browse the UNIST Graduate School of Artificial Intelligence and its research achievements.

Training-Free Stuck-At Fault Mitigation for ReRAM-Based Deep Learning Accelerators (IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.), Prof. Jongeun Lee

2023
01.01 - 12.31

Training-Free Stuck-At Fault Mitigation for ReRAM-Based Deep Learning Accelerators

Authors: Chenghao Quan, Mohammed E. Fouda, Senior Member, IEEE, Sugil Lee, Giju Jung, Jongeun Lee, Member, IEEE, Ahmed E. Eltawil, Senior Member, IEEE, and Fadi Kurdahi, Fellow, IEEE

Abstract: Although resistive RAMs (ReRAMs) can support highly efficient matrix–vector multiplication, which is very useful for machine learning and other applications, nonideal hardware behavior, such as stuck-at faults (SAFs) and IR drop, is an important concern in building ReRAM crossbar array-based deep learning accelerators. Previous work has addressed the nonideality problem through either hardware redundancy, which permanently increases hardware cost, or software retraining, which may be even more costly or unacceptable because it needs a training dataset and incurs high computation overhead. In this article, we propose a very lightweight method that can be applied on top of existing hardware or software solutions. Our method, called forward-parameter tuning (FPT), takes advantage of a certain statistical property of the activation data of neural network layers, and can mitigate the impact of mild nonidealities in ReRAM crossbar arrays (RCAs) for deep learning applications without any additional hardware, a dataset, or gradient-based training. Our experimental results using the MNIST, CIFAR-10, CIFAR-100, and ImageNet datasets in binary and multibit networks demonstrate that our technique is very effective, both alone and together with previous methods, up to a 20% fault rate, which is higher than that handled by even some previous remapping methods. We also evaluate our method in the presence of other nonidealities, such as variability and IR drop. Furthermore, we provide an analysis based on the concept of the effective fault rate (EFR), which not only demonstrates that EFR can be a useful tool for predicting the accuracy of faulty RCA-based neural networks but also explains why mitigating the SAF problem is more difficult with multibit neural networks.

Index Terms: Accelerator, artificial neural network, batch normalization (BN), ReRAM crossbar array, stuck-at fault (SAF)
