Systematic adaptation of stencil‐based 3D MPDATA to GPU architectures. (16th September 2016)
- Record Type:
- Journal Article
- Title:
- Systematic adaptation of stencil‐based 3D MPDATA to GPU architectures. (16th September 2016)
- Main Title:
- Systematic adaptation of stencil‐based 3D MPDATA to GPU architectures
- Authors:
- Rojek, Krzysztof
Wyrzykowski, Roman
Kuczynski, Lukasz
- Other Names:
- Melab, Nouredine (guest editor)
Mezmaz, Mohand (guest editor)
Wyrzykowski, Roman (guest editor)
Szymanski, Boleslaw K. (guest editor)
- Abstract:
- Summary: In this work, we focus on a systematic adaptation of the stencil‐based multidimensional positive definite advection transport algorithm (MPDATA) to different graphics processing unit (GPU)‐based computing platforms. Another objective of this work is to compare the performance of MPDATA on several platforms, including a multi‐GPU system with two NVIDIA Tesla K80 cards, and single‐card platforms with Tesla K20X, GeForce GTX TITAN, and GeForce GTX 980. The following optimization methods are proposed to improve the overall performance: (i) reducing the number of operations through subexpression elimination when implementing 2.5D blocking; (ii) reorganizing boundary conditions to reduce branch instructions; (iii) advanced memory management to increase coalesced memory access; and (iv) warp rearrangement to optimize data access to GPU global memory. The presented methods of adapting MPDATA to GPU architectures allow us to efficiently use many graphics processors within a single node by applying peer‐to‐peer data transfers between GPU global memories. We propose an auto‐tuning procedure to compensate for architectural differences between the considered platforms. This procedure takes into account algorithm‐ and GPU‐specific parameters. The proposed approach to adapting MPDATA to GPU architectures allows us to achieve up to 482.5 Gflop/s on the platform equipped with two NVIDIA K80 GPUs. Copyright © 2016 John Wiley & Sons, Ltd.
- Is Part Of:
- Concurrency and computation. Volume 29:Number 9(2017)
- Journal:
- Concurrency and computation
- Issue:
- Volume 29:Number 9(2017)
- Issue Display:
- Volume 29, Issue 9 (2017)
- Year:
- 2017
- Volume:
- 29
- Issue:
- 9
- Issue Sort Value:
- 2017-0029-0009-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2016-09-16
- Subjects:
- GPU -- Kepler and Maxwell architectures -- stencils -- MPDATA -- CUDA -- auto‐tuning
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.3970
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 381.xml