NPIY : A novel partitioner for improving mapreduce performance. (June 2018)
- Record Type:
- Journal Article
- Title:
- NPIY : A novel partitioner for improving mapreduce performance. (June 2018)
- Main Title:
- NPIY : A novel partitioner for improving mapreduce performance
- Authors:
- Lu, Wei
Chen, Lei
Wang, Liqiang
Yuan, Haitao
Xing, Weiwei
Yang, Yong
- Abstract:
- MapReduce is an effective and widely used framework for processing large datasets in parallel over a cluster of computers. Data skew, cluster heterogeneity, and network traffic are three issues that significantly affect the performance of MapReduce applications. However, the hash-based partitioner in native Hadoop does not consider these factors. This paper proposes a new partitioner for Yarn (Hadoop 2.6.0), namely, NPIY, which adopts an innovative parallel sampling method to distribute intermediate data. The paper makes the following major contributions: (1) NPIY mitigates data skew in MapReduce applications; (2) NPIY considers the heterogeneity of computing resources to balance the loads among Reducers; (3) NPIY reduces the network traffic in the shuffle phase by trying to retain intermediate data on those nodes running both map and reduce tasks. Compared with native Hadoop and other popular strategies, NPIY can reduce execution time by up to 41.66% and 58.68% in homogeneous and heterogeneous clusters, respectively. We further customize NPIY for parallel image processing, and the execution time has been improved by 28.8% compared with native Hadoop.
- Is Part Of:
- Journal of visual languages & computing. Volume 46(2018)
- Journal:
- Journal of visual languages & computing
- Issue:
- Volume 46(2018)
- Issue Display:
- Volume 46, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 46
- Issue:
- 2018
- Issue Sort Value:
- 2018-0046-2018-0000
- Page Start:
- 1
- Page End:
- 11
- Publication Date:
- 2018-06
- Subjects:
- MapReduce -- Hadoop -- Data skew -- Load balance -- Data transmission amount -- Heterogeneous -- Parallel image processing
00-01 -- 99-00
Visual programming languages (Computer science) -- Periodicals
Visual programming (Computer science) -- Periodicals
Programming languages (Electronic computers) -- Semantics -- Periodicals
Langages de programmation visuelle -- Périodiques
Programmation visuelle -- Périodiques
Langages de programmation -- Sémantique -- Périodiques
Programming languages (Electronic computers) -- Semantics
Visual programming (Computer science)
Visual programming languages (Computer science)
Periodicals
Electronic journals
005
- Journal URLs:
- http://www.sciencedirect.com/science/journal/1045926X
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.jvlc.2018.04.001
- Languages:
- English
- ISSNs:
- 1045-926X
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5072.495200
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17082.xml