A data‐aware scheduling strategy for workflow execution in clouds. (14th August 2017)
- Record Type:
- Journal Article
- Title:
- A data‐aware scheduling strategy for workflow execution in clouds. (14th August 2017)
- Main Title:
- A data‐aware scheduling strategy for workflow execution in clouds
- Authors:
- Marozzo, Fabrizio
Rodrigo Duro, Francisco
Garcia Blas, Javier
Carretero, Jesus
Talia, Domenico
Trunfio, Paolo
- Other Names:
- Carretero Jesus guestEditor.
Garcia‐Blas Javier guestEditor.
Nakano Koji guestEditor.
Mueller Peter guestEditor.
Grosu Daniel guestEditor.
Zheng Sheng guestEditor.
Xu Li guestEditor.
Xu Zheng guestEditor.
Yen Neil guestEditor.
Sugumaran Vijayan guestEditor.
- Abstract:
- Summary: As data‐intensive scientific computing systems become more widespread, there is a need to simplify the development, deployment, and execution of complex data analysis applications for scientific discovery. The scientific workflow model is the leading approach for designing and executing data‐intensive applications in high‐performance computing infrastructures. Scientific workflows are commonly built from a set of connected tasks, arranged as a directed acyclic graph, that communicate through storage abstractions. The Data Mining Cloud Framework (DMCF) is a system that allows users to design and execute data analysis workflows on cloud platforms, relying on cloud storage services for every I/O operation. Hercules is an in‐memory I/O solution that can be used in DMCF as an alternative to cloud storage services, providing additional performance and flexibility features. This work improves the integration between DMCF and Hercules by using a data‐aware scheduling strategy to exploit data locality in data‐intensive workflows. This paper presents experimental results demonstrating the performance improvements achieved using the proposed data‐aware scheduling strategy on the Microsoft Azure cloud platform. In particular, with our scheduling strategy, the I/O overhead was reduced by 55% with respect to Azure storage, leading to a 20% reduction in total execution time.
- Is Part Of:
- Concurrency and computation. Volume 29:Number 24(2017)
- Journal:
- Concurrency and computation
- Issue:
- Volume 29:Number 24(2017)
- Issue Display:
- Volume 29, Issue 24 (2017)
- Year:
- 2017
- Volume:
- 29
- Issue:
- 24
- Issue Sort Value:
- 2017-0029-0024-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2017-08-14
- Subjects:
- data‐aware scheduling -- DMCF -- Hercules -- in‐memory storage -- Microsoft Azure -- workflows
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.4229
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 5422.xml