Dual tree traversal on integrated GPUs for astrophysical N-body simulations. (September 2019)
- Record Type:
- Journal Article
- Title:
- Dual tree traversal on integrated GPUs for astrophysical N-body simulations. (September 2019)
- Main Title:
- Dual tree traversal on integrated GPUs for astrophysical N-body simulations
- Authors:
- Fortin, Pierre
Touche, Maxime
- Other Names:
- Mascagni Michael guest-editor.
- Abstract:
- In astrophysical N-body simulations, O(N) fast multipole methods (FMMs) with dual tree traversal (DTT) on multi-core CPUs are faster than O(N log N) CPU tree-codes but can still be outperformed by GPU ones. In this article, we aim at combining the best algorithm, namely FMM with DTT, with the most powerful hardware currently available, namely GPUs. In the astrophysical context requiring low accuracies and non-uniform particle distributions, we show that such combination can be achieved thanks to a hybrid CPU-GPU algorithm on integrated GPUs: while the DTT is performed on the CPU cores, the far- and near-field computations are all performed on the GPU cores. We show how to efficiently expose the interactions resulting from the DTT to the GPU cores, how to deploy both the far- and near-field computations on GPU, and how to overlap the parallel DTT on CPU with GPU computations. Based on the falcON code and using OpenCL on AMD Accelerated Processing Units and on Intel integrated GPUs, this first heterogeneous deployment of DTT for FMM outperforms standard multi-core CPUs and matches GPU and high-end CPU performance, being hence more cost- and power-efficient.
- Is Part Of:
- International journal of high performance computing applications. Volume 33:Number 5(2019)
- Journal:
- International journal of high performance computing applications
- Issue:
- Volume 33:Number 5(2019)
- Issue Display:
- Volume 33, Issue 5 (2019)
- Year:
- 2019
- Volume:
- 33
- Issue:
- 5
- Issue Sort Value:
- 2019-0033-0005-0000
- Page Start:
- 960
- Page End:
- 972
- Publication Date:
- 2019-09
- Subjects:
- Dual tree traversal -- integrated GPU -- hybrid CPU-GPU algorithm -- fast multipole method -- astrophysics
High performance computing -- Periodicals
Supercomputers -- Periodicals
004.1105
- Journal URLs:
- http://hpc.sagepub.com
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
- DOI:
- 10.1177/1094342019840806
- Languages:
- English
- ISSNs:
- 1094-3420
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11071.xml