Sparse convolutional neural network acceleration with lossless input feature map compression for resource‐constrained systems. Issue 1 (29th November 2021)
- Record Type:
- Journal Article
- Title:
- Sparse convolutional neural network acceleration with lossless input feature map compression for resource‐constrained systems. Issue 1 (29th November 2021)
- Main Title:
- Sparse convolutional neural network acceleration with lossless input feature map compression for resource‐constrained systems
- Authors:
- Kwon, Jisu
Kong, Joonho
Munir, Arslan
- Many recent research efforts have exploited data sparsity for the acceleration of convolutional neural network (CNN) inferences. However, the effects of data transfer between main memory and the CNN accelerator have been largely overlooked. In this work, the authors propose a CNN acceleration technique that leverages hardware/software co‐design and exploits the sparsity in input feature maps (IFMs). On the software side, the authors' technique employs a novel lossless compression scheme for IFMs, which are sent to the hardware accelerator via direct memory access. On the hardware side, the authors' technique uses a CNN inference accelerator that performs convolutional layer operations with their compressed data format. With several design optimization techniques, the authors have implemented their technique in a field‐programmable gate array (FPGA) system‐on‐chip platform and evaluated their technique for six different convolutional layers in SqueezeNet. Results reveal that the authors' technique improves the performance by 1.1×–22.6× while reducing energy consumption by 47.7%–97.4% as compared to the CPU‐based execution. Furthermore, results indicate that the IFM size and transfer latency are reduced by 34.0%–85.2% and 4.4%–75.7%, respectively, compared to the case without data compression. In addition, the authors' hardware accelerator shows better performance per hardware resource with less than or comparable power consumption to the state‐of‐the‐art FPGA‐based designs.
- Is Part Of:
- IET computers & digital techniques. Volume 16: Issue 1 (2022)
- Journal:
- IET computers & digital techniques
- Issue:
- Volume 16: Issue 1 (2022)
- Issue Display:
- Volume 16, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 16
- Issue:
- 1
- Issue Sort Value:
- 2022-0016-0001-0000
- Page Start:
- 29
- Page End:
- 43
- Publication Date:
- 2021-11-29
- Subjects:
- accelerator -- compression -- convolutional neural networks -- field programmable gate array -- input sparsity
Computers -- Periodicals
Digital electronics -- Periodicals
Computer engineering -- Periodicals
Computer architecture -- Periodicals
Computer organization -- Periodicals
621.39
- Journal URLs:
- http://digital-library.theiet.org/content/journals/iet-cdt
http://ieeexplore.ieee.org/servlet/opac?punumber=4117424
http://www.ietdl.org/IET-CDT
https://ietresearch.onlinelibrary.wiley.com/journal/1751861x
http://www.theiet.org/
- DOI:
- 10.1049/cdt2.12038
- Languages:
- English
- ISSNs:
- 1751-8601
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4363.252300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20811.xml