Publication Details

ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining

MRÁZEK Vojtěch, VAŠÍČEK Zdeněk, SEKANINA Lukáš, HANIF Muhammad A. and SHAFIQUE Muhammad. ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining. In: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design. Denver: Institute of Electrical and Electronics Engineers, 2019, pp. 1-8. ISBN 978-1-7281-2350-9. Available from: https://arxiv.org/abs/1907.07229

Czech title

ALWANN: Automatická aproximace vrstev neuronových sítích v akcelerátorech

Type

conference paper

Language

English

Authors

Mrázek Vojtěch, Ing., Ph.D. (DCSY FIT BUT)
Vašíček Zdeněk, doc. Ing., Ph.D. (DCSY FIT BUT)
Sekanina Lukáš, prof. Ing., Ph.D. (DCSY FIT BUT)
Hanif Muhammad A. (TU-Wien)
Shafique Muhammad (TU-Wien)

URL

https://arxiv.org/abs/1907.07229

Keywords

approximate computing, deep neural networks, computational path, ResNet, CIFAR-10

Abstract

State-of-the-art approaches employ approximate computing to reduce the energy consumption of DNN hardware. Approximate DNNs then require extensive retraining to recover from the accuracy loss caused by the approximate operations. However, retraining of complex DNNs does not scale well. In this paper, we demonstrate that efficient approximations can be introduced into the computational path of DNN accelerators while retraining is avoided completely.
ALWANN provides highly optimized implementations of DNNs for custom low-power accelerators in which the number of computing units is lower than the number of DNN layers. First, a fully trained DNN is converted to operate with 8-bit weights and 8-bit multipliers in convolutional layers. A suitable approximate multiplier is then selected for each computing element from a library of approximate multipliers in such a way that (i) one approximate multiplier serves several layers, and (ii) the overall classification error and energy consumption are minimized. The optimizations, including the multiplier selection problem, are solved by means of the multiobjective optimization algorithm NSGA-II. To completely avoid the computationally expensive retraining of the DNN, which is usually employed to improve the classification accuracy, we propose a simple weight updating scheme that compensates for the inaccuracy introduced by the approximate multipliers. The proposed approach is evaluated for two architectures of DNN accelerators with approximate multipliers from the open-source "EvoApprox" library. We report that the proposed approach saves 30% of the energy needed for multiplication in the convolutional layers of ResNet-50 while the accuracy is degraded by only 0.6%. The proposed technique and approximate layers are available as an open-source extension of TensorFlow at https://github.com/ehw-fit/tf-approximate.
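
The abstract only summarizes the weight updating scheme, so the following is a minimal sketch of one plausible reading of it, not the authors' implementation. It assumes unsigned 8-bit operands and uses a hypothetical truncating multiplier, `approx_mul`, as a stand-in for a multiplier from the EvoApprox library. The names `product_table`, `build_weight_map`, and `total_error` are illustrative and are not part of tf-approximate. For each weight, the sketch picks the replacement weight whose approximate products, summed over all inputs, lie closest to the exact products.

```python
import numpy as np

VALUES = np.arange(256, dtype=np.int64)  # all unsigned 8-bit operand values


def approx_mul(a, w):
    """Hypothetical approximate multiplier (assumption, not from EvoApprox):
    it drops the two low bits of each operand before multiplying."""
    return (a & ~0x3) * (w & ~0x3)


def product_table(mul):
    """Return table[w, a] = mul(a, w) for every pair of 8-bit operands."""
    return mul(VALUES[None, :], VALUES[:, None])


def build_weight_map(mul):
    """For each weight w, choose the w' that minimizes the summed absolute
    difference between mul(a, w') and the exact product a * w."""
    approx = product_table(mul)                # rows indexed by candidate w'
    exact = product_table(lambda a, w: a * w)  # rows indexed by original w
    wmap = np.empty(256, dtype=np.int64)
    for w in VALUES:
        # Error of every candidate w', accumulated over all inputs a.
        err = np.abs(approx - exact[w]).sum(axis=1)
        wmap[w] = err.argmin()
    return wmap


def total_error(mul, wmap):
    """Summed absolute product error when weight w is replaced by wmap[w]."""
    approx = product_table(mul)
    exact = product_table(lambda a, w: a * w)
    return int(np.abs(approx[wmap] - exact).sum())


if __name__ == "__main__":
    wmap = build_weight_map(approx_mul)
    print("error without remapping:", total_error(approx_mul, VALUES))
    print("error with remapping:   ", total_error(approx_mul, wmap))
```

Under these assumptions, the map would be computed once per multiplier and applied to the already quantized weights (for example `wmap[quantized_weights]`) before inference, which requires no training data and no gradient steps.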

Published

2019

Pages

1-8

Proceedings

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design

Conference

IEEE/ACM International Conference On Computer-Aided Design, Denver, CO, US

ISBN

978-1-7281-2350-9

Publisher

Institute of Electrical and Electronics Engineers

Place

Denver, US

DOI

10.1109/ICCAD45719.2019.8942068

UT WoS

000524676400028

EID Scopus

2-s2.0-85077802459

BibTeX

@INPROCEEDINGS{FITPUB11959,
   author = "Vojt\v{e}ch Mr\'{a}zek and Zden\v{e}k Va\v{s}\'{i}\v{c}ek and Luk\'{a}\v{s} Sekanina and Muhammad A. Hanif and Muhammad Shafique",
   title = "ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining",
   pages = "1--8",
   booktitle = "Proceedings of the IEEE/ACM International Conference on Computer-Aided Design",
   year = 2019,
   location = "Denver, US",
   publisher = "Institute of Electrical and Electronics Engineers",
   ISBN = "978-1-7281-2350-9",
   doi = "10.1109/ICCAD45719.2019.8942068",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11959"
}