A sparse matrix vector multiplication accelerator based on high-bandwidth memory. (January 2023)
- Record Type:
- Journal Article
- Title:
- A sparse matrix vector multiplication accelerator based on high-bandwidth memory. (January 2023)
- Main Title:
- A sparse matrix vector multiplication accelerator based on high-bandwidth memory
- Authors:
- Li, Tao
Shen, Li
- Abstract:
- Sparse matrix–vector multiplication (SpMV) is an important kernel that is widely used in science and engineering applications. The features of SpMV, such as high memory-intensiveness and many different access patterns, cause the performance of SpMV to be bounded by the limited bandwidth between memory and processing units. Processing in memory (PIM) is a novel architecture used to overcome the bandwidth bottleneck by shortening the distance between processing elements (PE) and memory. In this paper, we propose a PIM-structure SpMV accelerator based on high-bandwidth memory (HBM). To make full use of the high bandwidth provided by HBM, we design a highly parallel PE array and implement a high-frequency pipeline inside the PE to hide the latency of reading matrix elements from HBM. For each PE, we integrate an L1 cache to exploit the data locality in the vector. We propose two data layout strategies, namely a row merging algorithm to exploit the inter-row data locality and a row assignment algorithm to achieve workload balance among PEs. Our design is implemented using a field-programmable gate array (FPGA) card with 8 GB HBM2 memory. Compared to the baseline central processing unit (CPU) SpMV implementation, our accelerator can obtain a 5.24x performance speedup on average.
Highlights:
We propose an HBM-based PIM-structure SpMV accelerator. It has a highly parallel PE array. In each PE, we integrate an L1 cache to exploit the data locality.
Compared to the basic SpMV implementations on CPUs, our accelerator can obtain a 5.24x performance speedup on average.
We propose two data layout strategies, namely row merging and group assignment, to exploit inter-row locality and improve the workload balance.
Compared to the accelerator without any data layout optimization, our design can obtain a 1.9x performance speedup on average.
- Is Part Of:
- Computers & electrical engineering. Volume 105(2023)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 105(2023)
- Issue Display:
- Volume 105, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 105
- Issue:
- 2023
- Issue Sort Value:
- 2023-0105-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Sparse matrix-vector multiplication -- Processor in memory -- High bandwidth memory -- Accelerator -- Data layout
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2022.108488
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25029.xml
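
Illustrative note on the abstract: the record does not include the paper's algorithms, so the following is only a minimal sketch of the two ideas the abstract names, the SpMV kernel and workload balancing among PEs. It assumes a compressed sparse row (CSR) layout and a generic greedy "largest row to least-loaded PE" heuristic; the function names `spmv_csr` and `assign_rows` are hypothetical, and neither function is claimed to be the authors' row merging or row assignment algorithm.

```python
import heapq


def spmv_csr(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Nonzeros of this row occupy data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def assign_rows(indptr, num_pes):
    """Greedy balancing sketch (an assumption, not the paper's method):
    give each row, largest first, to the PE with the fewest nonzeros so far."""
    rows = [(indptr[r + 1] - indptr[r], r) for r in range(len(indptr) - 1)]
    rows.sort(key=lambda t: (-t[0], t[1]))  # most nonzeros first, ties by row index
    heap = [(0, pe) for pe in range(num_pes)]  # (current load, PE id)
    heapq.heapify(heap)
    groups = [[] for _ in range(num_pes)]
    for count, row in rows:
        load, pe = heapq.heappop(heap)
        groups[pe].append(row)
        heapq.heappush(heap, (load + count, pe))
    return groups


# 3x3 example matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in CSR form.
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(indptr, indices, data, x)
groups = assign_rows(indptr, 2)
```

In this sketch each PE would then run `spmv_csr`-style accumulation over only the rows in its group; the irregular reads of `x[indices[k]]` are the vector accesses that a per-PE cache, as described in the abstract, is meant to serve.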