Towards designing a hardware accelerator for 3D convolutional neural networks. (January 2023)
- Record Type:
- Journal Article
- Title:
- Towards designing a hardware accelerator for 3D convolutional neural networks. (January 2023)
- Main Title:
- Towards designing a hardware accelerator for 3D convolutional neural networks
- Authors:
- Khan, Fatima Hameed
Pasha, Muhammad Adeel
Masud, Shahid
- Abstract:
- Highlights: The hardware design of 3D CNNs requires massive compute and memory resources. A 3-dimensional design space exploration is performed to reduce the complexity of the model. First, an efficient word length is found that reduces the feature map and kernel sizes. Next, input data tiling is performed to efficiently utilize the on-chip memory. Then, different modes of parallelization are explored to minimize the off-chip accesses. Finally, a complete FPGA-based hardware accelerator is proposed that can achieve a throughput of 1.29 TOPs/s in the inference stage.
Abstract: The hardware design of 3D Convolutional Neural Networks (CNNs) requires massive compute and memory resources due to an additional temporal dimension. This paper explores various design parameters for 3D CNNs that enable an efficient implementation of such a complex network on resource-limited platforms. A new Inception-based 3D CNN model, the I3D, has been chosen for investigating and optimizing its design parameters. The I3D model is a deep network with over 70 layers, and it is used for action recognition in videos. The complexity of this model is first reduced by adjusting the word lengths of feature maps and weights in a pre-trained model while retaining a negligible drop in accuracy. Second, a data tiling technique is proposed that exploits five dimensions of a video data volume to obtain improved memory bandwidth and reduced Dynamic Random Access Memory (DRAM) accesses. Finally, based on these optimizations, the complete architecture of a Field Programmable Gate Array (FPGA) based hardware accelerator is proposed that can achieve a throughput of 684 GOPs/s for 32-bit floating point (FP) and 1.29 TOPs/s for 8-bit integer implementations, with only a 2% drop in accuracy.
- Is Part Of:
- Computers & electrical engineering. Volume 105(2023)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 105(2023)
- Issue Display:
- Volume 105, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 105
- Issue:
- 2023
- Issue Sort Value:
- 2023-0105-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- 3D CNN -- Hardware architecture -- Design space -- FPGA implementation -- I3D network -- Input tiling -- Memory access optimization -- Time space mapping
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2022.108489
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25029.xml