Three practical workflow schedulers for easy maximum parallelism. (27th October 2021)
- Record Type:
- Journal Article
- Title:
- Three practical workflow schedulers for easy maximum parallelism. (27th October 2021)
- Main Title:
- Three practical workflow schedulers for easy maximum parallelism
- Authors:
- Rogers, David M.
- Other Names:
- Chandrasekaran, Sunita, guest editor
- Si, Min, guest editor
- Zhai, Jidong, guest editor
- Oden, Lena, guest editor
- Abstract:
- Runtime scheduling and workflow systems are an increasingly popular algorithmic component in HPC because they allow full system utilization with relaxed synchronization requirements. There are so many special‐purpose tools for task scheduling, one might wonder why more are needed. Use cases seen on the Summit supercomputer needed better integration with MPI and greater flexibility in job launch configurations. Preparation, execution, and analysis of computational chemistry simulations at the scale of tens of thousands of processors revealed three distinct workflow patterns. A separate job scheduler was implemented for each one using extremely simple and robust designs: file‐based, task‐list based, and bulk‐synchronous. Comparing to existing methods shows unique benefits of this work, including simplicity of design, suitability for HPC centers, short startup time, and well‐understood per‐task overhead. All three new tools have been shown to scale to full utilization of Summit, and have been made publicly available with tests and documentation. This work presents a complete characterization of the minimum effective task granularity for efficient scheduler usage scenarios. These schedulers have the same bottlenecks, and hence similar task granularities as those reported for existing tools following comparable paradigms.
- Is Part Of:
- Software, practice & experience. Volume 53:Number 1(2023)
- Journal:
- Software, practice & experience
- Issue:
- Volume 53:Number 1(2023)
- Issue Display:
- Volume 53, Issue 1 (2023)
- Year:
- 2023
- Volume:
- 53
- Issue:
- 1
- Issue Sort Value:
- 2023-0053-0001-0000
- Page Start:
- 99
- Page End:
- 114
- Publication Date:
- 2021-10-27
- Subjects:
- distributed asynchronous -- runtime scheduling -- task graph -- workflow management systems
- Computer software -- Periodicals
- Computer programming -- Periodicals
- Computer programs -- Periodicals
- 005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.3047
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
- British Library DSC - BLDSS-3PM
- British Library STI - ELD Digital store
- Ingest File:
- 24724.xml