ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing. (23rd October 2014)
- Record Type:
- Journal Article
- Title:
- ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing. (23rd October 2014)
- Main Title:
- ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing
- Authors:
- Docan, Ciprian
Zhang, Fan
Jin, Tong
Bui, Hoang
Sun, Qian
Cummings, Julian
Podhorszki, Norbert
Klasky, Scott
Parashar, Manish
- Abstract:
- Summary: Managing the large volumes of data produced by emerging scientific and engineering simulations running on leadership‐class resources has become a critical challenge. The data have to be extracted off the computing nodes and transported to consumer nodes so that it can be processed, analyzed, visualized, archived, and so on. Several recent research efforts have addressed data‐related challenges at different levels. One attractive approach is to offload expensive input/output operations to a smaller set of dedicated computing nodes known as a staging area. However, even using this approach, the data still have to be moved from the staging area to consumer nodes for processing, which continues to be a bottleneck. In this paper, we investigate an alternate approach, namely moving the data‐processing code to the staging area instead of moving the data to the data‐processing code. Specifically, we describe the ActiveSpaces framework, which provides (1) programming support for defining the data‐processing routines to be downloaded to the staging area and (2) runtime mechanisms for transporting codes associated with these routines to the staging area, executing the routines on the nodes that are part of the staging area, and returning the results. We also present an experimental performance evaluation of ActiveSpaces using applications running on the Cray XT5 at Oak Ridge National Laboratory. Finally, we use a coupled fusion application workflow to explore the trade‐offs between transporting data and transporting the code required for data processing during coupling, and we characterize sweet spots for each option. Copyright © 2014 John Wiley & Sons, Ltd.
- Is Part Of:
- Concurrency and computation. Volume 27:Number 14(2015:Sep.)
- Journal:
- Concurrency and computation
- Issue:
- Volume 27:Number 14(2015:Sep.)
- Issue Display:
- Volume 27, Issue 14 (2015)
- Year:
- 2015
- Volume:
- 27
- Issue:
- 14
- Issue Sort Value:
- 2015-0027-0014-0000
- Page Start:
- 3724
- Page End:
- 3745
- Publication Date:
- 2014-10-23
- Subjects:
- dynamic code deployment -- data processing -- data‐intensive application workflows -- coupled simulations
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.3407
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 4470.xml