Distributed-memory lattice H-matrix factorization. (September 2019)
- Record Type:
- Journal Article
- Title:
- Distributed-memory lattice H-matrix factorization. (September 2019)
- Main Title:
- Distributed-memory lattice H-matrix factorization
- Authors:
- Yamazaki, Ichitaro
Ida, Akihiro
Yokota, Rio
Dongarra, Jack
- Other Names:
- Mascagni, Michael, guest editor.
- Abstract:
- We parallelize the LU factorization of a hierarchical low-rank matrix (H-matrix) on a distributed-memory computer. This is much more difficult than the H-matrix-vector multiplication due to the dataflow of the factorization, and it is much harder than the parallelization of a dense matrix factorization due to the irregular hierarchical block structure of the matrix. Block low-rank (BLR) format gets rid of the hierarchy and simplifies the parallelization, often increasing concurrency. However, this comes at a price of losing the near-linear complexity of the H-matrix factorization. In this work, we propose to factorize the matrix using a "lattice H-matrix" format that generalizes the BLR format by storing each of the blocks (both diagonals and off-diagonals) in the H-matrix format. These blocks stored in the H-matrix format are referred to as lattices. Thus, this lattice format aims to combine the parallel scalability of BLR factorization with the near-linear complexity of H-matrix factorization. We first compare factorization performances using the H-matrix, BLR, and lattice H-matrix formats under various conditions on a shared-memory computer. Our performance results show that the lattice format has storage and computational complexities similar to those of the H-matrix format, and hence a much lower cost of factorization than BLR. We then compare the BLR and lattice H-matrix factorization on distributed-memory computers. Our performance results demonstrate that compared with BLR, the lattice format with the lower cost of factorization may lead to faster factorization on the distributed-memory computer.
- Is Part Of:
- International journal of high performance computing applications. Volume 33:Number 5(2019)
- Journal:
- International journal of high performance computing applications
- Issue:
- Volume 33:Number 5(2019)
- Issue Display:
- Volume 33, Issue 5 (2019)
- Year:
- 2019
- Volume:
- 33
- Issue:
- 5
- Issue Sort Value:
- 2019-0033-0005-0000
- Page Start:
- 1046
- Page End:
- 1063
- Publication Date:
- 2019-09
- Subjects:
- boundary element method -- LU factorization -- distributed memory -- hierarchical matrix -- task programming
High performance computing -- Periodicals
Supercomputers -- Periodicals
004.1105
- Journal URLs:
- http://hpc.sagepub.com
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
- DOI:
- 10.1177/1094342019861139
- Languages:
- English
- ISSNs:
- 1094-3420
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11071.xml
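
Illustrative note on the abstract: the block low-rank (BLR) idea it describes, where a matrix is cut into a flat grid of tiles and each off-diagonal tile is replaced by a low-rank product, can be sketched in a few lines of NumPy. This is not the authors' code. The kernel matrix, the tile size, the tolerance, and every function name below are assumptions chosen for illustration, and the sketch covers only compression, not the LU factorization or the distributed-memory scheduling studied in the article. In the lattice H-matrix format, each tile would itself be stored as an H-matrix rather than as a dense or single low-rank block.

```python
import numpy as np


def compress_block(block, tol):
    """Truncated SVD of one tile: returns (U, V) with block approximately U @ V.

    Singular values below tol times the largest one are dropped.
    """
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    rank = max(1, int(np.sum(s > tol * s[0])))
    return u[:, :rank] * s[:rank], vt[:rank, :]


def blr_compress(a, block_size, tol=1e-8):
    """Flat (non-hierarchical) tiling: dense diagonal tiles, low-rank off-diagonal tiles."""
    n = a.shape[0]
    blocks = {}
    for i in range(0, n, block_size):
        for j in range(0, n, block_size):
            tile = a[i:i + block_size, j:j + block_size]
            blocks[(i, j)] = tile.copy() if i == j else compress_block(tile, tol)
    return blocks


def blr_storage(blocks):
    """Number of floating-point values stored in the compressed representation."""
    total = 0
    for value in blocks.values():
        if isinstance(value, tuple):
            total += value[0].size + value[1].size
        else:
            total += value.size
    return total


def blr_to_dense(blocks, n):
    """Rebuild a dense matrix from the tiles, for checking the approximation error."""
    a = np.zeros((n, n))
    for (i, j), value in blocks.items():
        tile = value[0] @ value[1] if isinstance(value, tuple) else value
        a[i:i + tile.shape[0], j:j + tile.shape[1]] = tile
    return a


def kernel_matrix(n):
    """Hypothetical test matrix: a smooth kernel whose off-diagonal tiles are numerically low rank."""
    x = np.linspace(0.0, 1.0, n)
    return 1.0 / (1.0 + n * np.abs(x[:, None] - x[None, :]))


if __name__ == "__main__":
    n, block_size = 256, 64
    a = kernel_matrix(n)
    blocks = blr_compress(a, block_size)
    error = np.linalg.norm(a - blr_to_dense(blocks, n)) / np.linalg.norm(a)
    print("dense entries:     ", a.size)
    print("compressed entries:", blr_storage(blocks))
    print("relative error:    ", error)
```

Running the script prints the dense and compressed entry counts along with the relative reconstruction error, which gives a rough feel for the storage trade-off the abstract refers to.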