Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks

PIŇOS, M.; MRÁZEK, V.; VAVERKA, F.; VAŠÍČEK, Z.; SEKANINA, L. Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2023, vol. 13, no. 1, p. 212-224. ISSN: 2156-3357.

Type

journal article

Language

English

Authors

Piňos Michal, Ing., DCSY (FIT)
Mrázek Vojtěch, Ing., Ph.D., DCSY (FIT)
Vaverka Filip, Ing., Ph.D.
Vašíček Zdeněk, doc. Ing., Ph.D., DCSY (FIT)
Sekanina Lukáš, prof. Ing., Ph.D., DCSY (FIT)

Abstract

The main issue with using approximate components, such as approximate multipliers, in the design of deep convolutional neural networks (CNNs) is that they must be emulated, because modern CPUs and GPUs lack native support for approximate operations, and this emulation is computationally expensive. To accelerate the emulation of approximate CNN operations on GPUs, we propose TFApprox4IL, a software library supporting both symmetric and asymmetric quantization modes, approximate 8×N-bit multipliers emulated using lookup tables, a new type of approximate layer known as approximate depthwise convolution, and quantization-aware training. The performance of TFApprox4IL is extensively evaluated in the simulation of approximate implementations of MobileNetV2 and ResNet networks on Nvidia Pascal and Tesla GPU architectures. Furthermore, TFApprox4IL is evaluated in neural architecture search (NAS) algorithms that automatically design CNN architectures directly employing approximate multipliers. Using two different NAS methods, EvoApproxNAS and Google Model Search (GMS), we show how approximate multipliers can be effectively incorporated into the CNN design process. To estimate the energy consumption of the approximate CNNs, the AxMultAT tool, based on Timeloop and Accelergy, is introduced. Compared with the highly optimized GPU-based CNN simulation implemented using exact arithmetic operations available in TensorFlow, the average overhead introduced by TFApprox4IL for inference and training is 13.6× and 8.0×, respectively, considering the ResNet50V2 and MobileNetV2 CNNs on the ImageNet and CIFAR-10 data sets. This overhead is one order of magnitude lower than that of previous methods.
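
The lookup-table emulation mentioned in the abstract can be illustrated with a minimal sketch in plain Python. It is not the TFApprox4IL implementation or its API: the truncating multiplier `approx_mul`, the helper names `quantize` and `approx_dot`, and the 8×8-bit operand width are all assumptions chosen for illustration. The idea shown is only that an approximate multiplier with 8-bit operands can be replaced by a precomputed table, so emulating a multiply-accumulate reduces to table lookups and exact additions.

```python
# Hypothetical approximate multiplier (an assumption for illustration, not a
# circuit from the paper): it drops the two least-significant bits of each
# unsigned 8-bit operand before multiplying.
def approx_mul(a: int, b: int) -> int:
    return (a & 0xFC) * (b & 0xFC)

# Precompute all 256 x 256 products once; the emulation of each
# multiplication then becomes a single table lookup.
LUT = [[approx_mul(a, b) for b in range(256)] for a in range(256)]

def quantize(x: float, scale: float, zero_point: int) -> int:
    """Asymmetric (affine) quantization of a real value to unsigned 8 bits."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))  # clamp to the uint8 range

def approx_dot(xs, ws):
    """Dot product of two uint8 sequences: approximate products from the
    table, exact accumulation."""
    return sum(LUT[x][w] for x, w in zip(xs, ws))

if __name__ == "__main__":
    acts = [quantize(v, scale=0.02, zero_point=0) for v in (0.5, 1.0, 2.0)]
    weights = [quantize(v, scale=0.01, zero_point=128) for v in (-0.3, 0.1, 0.7)]
    exact = sum(a * w for a, w in zip(acts, weights))
    print("approximate:", approx_dot(acts, weights), "exact:", exact)
```

This sketch accumulates the raw quantized values and omits the zero-point correction terms that a complete asymmetric-quantization pipeline needs. Symmetric quantization corresponds to the special case of a fixed zero point.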

Keywords

Approximate computing,
convolutional neural network,
neural architecture search,
energy efficiency,
quantization,
acceleration

URL

https://ieeexplore.ieee.org/document/10011413

Published

2023

Pages

212–224

Journal

IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 1, ISSN 2156-3357

DOI

10.1109/JETCAS.2023.3235204

UT WoS

000965262200001

EID Scopus

2-s2.0-85147308000

BibTeX

@article{BUT180721,
  author="Michal {Piňos} and Vojtěch {Mrázek} and Filip {Vaverka} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks",
  journal="IEEE Journal on Emerging and Selected Topics in Circuits and Systems",
  year="2023",
  volume="13",
  number="1",
  pages="212--224",
  doi="10.1109/JETCAS.2023.3235204",
  issn="2156-3357",
  url="https://ieeexplore.ieee.org/document/10011413"
}

Projects

Automated design of hardware accelerators for resource-aware machine learning, GACR, Standard Projects, GA21-13001S, start: 2021-01-01, end: 2023-12-31, completed

Research groups

Evolvable Hardware Research Group (RG EHW)

Departments

Department of Computer Systems (DCSY)