Minimizing total job completion time in MapReduce scheduling. (August 2021)
- Record Type:
- Journal Article
- Title:
- Minimizing total job completion time in MapReduce scheduling. (August 2021)
- Main Title:
- Minimizing total job completion time in MapReduce scheduling
- Authors:
- Dong, Jianming
Goebel, Randy
Hu, Jueliang
Lin, Guohui
Su, Bing - Abstract:
- Highlights: Study MapReduce scheduling as a generalization of two stage flow-shop scheduling. prove a new lower bound on the total job completion time. present an improved (9–1/m_1)-approximation algorithm. design numerical experiments to demonstrate effectiveness of the new lower bound. demonstrate the practical outperformance of the approximation algorithm. Abstract: We follow up an earlier studied multiple-task parallel-machine scheduling model that captures the core challenges in MapReduce scheduling, with the optimization goal to minimize the total job completion time. The problem is a novel generalization of the classic two-stage flow-shop scheduling, in which we are given a set of jobs each is associated with multiple map tasks and multiple reduce tasks. All these tasks are non-preemptive to be processed on multiple parallel identical map machines and multiple parallel identical reduce machines, respectively, under the strict precedence constraints that, for each job, any reduce task cannot start before all its map tasks have been finished. We prove a new lower bound on the total job completion time, and based on which we present an O ( n log n + N + m 1 + m 2 ) -time ( 9 - 3 m 1 ) -approximation algorithm, where n and N are the number of jobs and the total number of tasks, respectively, m 1 and m 2 are the numbers of map and reduce machines, respectively, and they can even be part of the input. Our algorithm improves the previous best 12-approximation, and it reducesHighlights: Study MapReduce scheduling as a generalization of two stage flow-shop scheduling. prove a new lower bound on the total job completion time. present an improved (9–1/m_1)-approximation algorithm. design numerical experiments to demonstrate effectiveness of the new lower bound. demonstrate the practical outperformance of the approximation algorithm. Abstract: We follow up an earlier studied multiple-task parallel-machine scheduling model that captures the core challenges in MapReduce scheduling, with the optimization goal to minimize the total job completion time. The problem is a novel generalization of the classic two-stage flow-shop scheduling, in which we are given a set of jobs each is associated with multiple map tasks and multiple reduce tasks. All these tasks are non-preemptive to be processed on multiple parallel identical map machines and multiple parallel identical reduce machines, respectively, under the strict precedence constraints that, for each job, any reduce task cannot start before all its map tasks have been finished. We prove a new lower bound on the total job completion time, and based on which we present an O ( n log n + N + m 1 + m 2 ) -time ( 9 - 3 m 1 ) -approximation algorithm, where n and N are the number of jobs and the total number of tasks, respectively, m 1 and m 2 are the numbers of map and reduce machines, respectively, and they can even be part of the input. Our algorithm improves the previous best 12-approximation, and it reduces to the best approximation algorithms for several interesting or well studied special cases. We confirm through numerical experiments on 828, 100 instances that both our lower bound and our algorithm significantly outperform: the empirical mean improvement ratio of the new lower bound is as high as 37.75 %, and the empirical mean approximation ratio of our algorithm is only 1.7582 . … (more)
- Is Part Of:
- Computers & industrial engineering. Volume 158(2021)
- Journal:
- Computers & industrial engineering
- Issue:
- Volume 158(2021)
- Issue Display:
- Volume 158, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 158
- Issue:
- 2021
- Issue Sort Value:
- 2021-0158-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-08
- Subjects:
- MapReduce scheduling -- Two-stage flow-shop -- Parallel identical machines -- Shortest processing time -- Approximation algorithm
Engineering -- Data processing -- Periodicals
Industrial engineering -- Periodicals
620.00285 - Journal URLs:
- http://www.sciencedirect.com/science/journal/03608352 ↗
http://www.elsevier.com/journals ↗ - DOI:
- 10.1016/j.cie.2021.107387 ↗
- Languages:
- English
- ISSNs:
- 0360-8352
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 3394.713000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 17323.xml