Publication Details

Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators

KLHŮFEK, J.; ŠAFÁŘ, M.; MRÁZEK, V.; VAŠÍČEK, Z.; SEKANINA, L. Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators. In 2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS). Kielce: Institute of Electrical and Electronics Engineers, 2024. p. 1-6. ISBN: 979-8-3503-5934-3.
Czech title
Výzkum synergie kvantizace a mapování v oblasti hardwarových akcelerátorů hlubokých neuronových sítí
Type
conference paper
Language
English
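Authors
Jan Klhůfek, Miroslav Šafář, Vojtěch Mrázek, Zdeněk Vašíček, Lukáš Sekanina
URL
https://arxiv.org/abs/2404.05368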
Keywords

Quantization, Neural networks, Hardware accelerator

Abstract

Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e., placement and scheduling of DNN elementary operations on hardware units of the accelerator). We show that enabling rich mixed quantization schemes during the implementation can open a previously hidden space of mappings that utilize the hardware resources more effectively. CNNs utilizing quantized weights and activations and suitable mappings can significantly improve trade-offs among the accuracy, energy, and memory requirements compared to less carefully optimized CNN implementations. To find, analyze, and exploit these mappings, we: (i) extend a general-purpose state-of-the-art mapping tool (Timeloop) to support mixed quantization, which is not currently available; (ii) propose an efficient multi-objective optimization algorithm to find the most suitable bit-widths and mapping for each DNN layer executed on the accelerator; and (iii) conduct a detailed experimental evaluation to validate the proposed method. On two CNNs (MobileNetV1 and MobileNetV2) and two accelerators (Eyeriss and Simba) we show that for a given quality metric (such as the accuracy on ImageNet), energy savings are up to 37% without any accuracy drop.

Published
2024
Pages
1–6
Proceedings
2024 27th International Symposium on Design & Diagnostics of Electronic Circuits & Systems (DDECS)
Conference
International Symposium on Design and Diagnostics of Electronic Circuits and Systems, Kielce, PL
ISBN
979-8-3503-5934-3
Publisher
Institute of Electrical and Electronics Engineers
Place
Kielce
DOI
10.1109/DDECS60919.2024.10508920
BibTeX
@inproceedings{BUT188463,
  author="Jan {Klhůfek} and Miroslav {Šafář} and Vojtěch {Mrázek} and Zdeněk {Vašíček} and Lukáš {Sekanina}",
  title="Exploiting Quantization and Mapping Synergy in Hardware-Aware Deep Neural Network Accelerators",
  booktitle="2024 27th International Symposium on Design \& Diagnostics of Electronic Circuits \& Systems (DDECS)",
  year="2024",
  pages="1--6",
  publisher="Institute of Electrical and Electronics Engineers",
  address="Kielce",
  doi="10.1109/DDECS60919.2024.10508920",
  isbn="979-8-3503-5934-3",
  url="https://arxiv.org/abs/2404.05368"
}