Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster. (April 2018)
- Record Type:
- Journal Article
- Title:
- Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster. (April 2018)
- Main Title:
- Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster
- Authors:
- Singh, Sudhakar
Garg, Rakhi
Mishra, P K
- Abstract:
- Highlights:
We propose four MapReduce-based Apriori algorithms that use a multi-pass MapReduce phase.
The proposed algorithms are VFPC, ETDPC, Optimized-VFPC and Optimized-ETDPC.
VFPC and ETDPC are more robust and flexible than the existing FPC and DPC algorithms.
Optimized-VFPC and Optimized-ETDPC skip pruning in some passes of the multi-pass phases.
Optimized-VFPC and Optimized-ETDPC outperform VFPC and ETDPC.
Abstract: Many techniques have been proposed to implement the Apriori algorithm on the MapReduce framework, but only a few have focused on performance improvement. The FPC (Fixed Passes Combined-counting) and DPC (Dynamic Passes Combined-counting) algorithms combine multiple passes of Apriori in a single MapReduce phase to reduce execution time. In this paper, we propose two improved MapReduce-based Apriori algorithms, VFPC (Variable Size based Fixed Passes Combined-counting) and ETDPC (Elapsed Time based Dynamic Passes Combined-counting), as improvements over FPC and DPC. We further optimize the multi-pass phases of these algorithms by skipping the pruning step in some passes, which yields the Optimized-VFPC and Optimized-ETDPC algorithms. Quantitative analysis reveals that the cost of counting the additional unpruned candidates produced by skipping pruning is less significant than the reduction in computation cost that skipping pruning brings. Experimental results show that VFPC and ETDPC are more robust and flexible than FPC and DPC, whereas their optimized versions are more efficient in terms of execution time.
- Is Part Of:
- Computers & electrical engineering. Volume 67(2018)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 67(2018)
- Issue Display:
- Volume 67, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 67
- Issue:
- 2018
- Issue Sort Value:
- 2018-0067-2018-0000
- Page Start:
- 348
- Page End:
- 364
- Publication Date:
- 2018-04
- Subjects:
- Algorithms -- Data mining -- Big data -- Frequent itemset -- Apriori -- Hadoop -- MapReduce
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2017.10.008
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17038.xml
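
The abstract above describes combining several Apriori passes into one MapReduce phase and optionally skipping the pruning step. The following is a minimal single-process sketch of that general idea, not the authors' implementation and not Hadoop code. The function names (`join`, `prune`, `combined_phase`) and the `skip_pruning` flag are hypothetical, and for simplicity the flag skips pruning in every pass of the phase, whereas the abstract only says that pruning is skipped in some passes.

```python
from collections import Counter
from itertools import combinations


def join(prev_sets, k):
    """Apriori join: merge (k-1)-itemsets that share their first k-2 items."""
    prev = sorted(tuple(sorted(s)) for s in prev_sets)
    out = set()
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            if a[:k - 2] != b[:k - 2]:
                break  # list is sorted, so no later itemset shares the prefix
            out.add(a + (b[-1],))
    return out


def prune(cands, prev_sets, k):
    """Apriori prune: keep candidates whose (k-1)-subsets are all in prev_sets."""
    prev = {tuple(sorted(s)) for s in prev_sets}
    return {c for c in cands if all(s in prev for s in combinations(c, k - 1))}


def combined_phase(transactions, prev_frequent, k, n_passes, min_count,
                   skip_pruning=False):
    """Simulate one multi-pass phase: generate candidates of sizes
    k .. k+n_passes-1 (each level from the previous level's candidates),
    then count all of them in a single scan of the transactions."""
    base = {tuple(sorted(s)) for s in prev_frequent}
    levels = {}
    for size in range(k, k + n_passes):
        cands = join(base, size)
        if not skip_pruning:  # assumption: the flag applies to every pass here
            cands = prune(cands, base, size)
        if not cands:
            break
        levels[size] = cands
        base = cands
    counts = Counter()
    for t in transactions:  # the single combined counting scan
        items = set(t)
        for cands in levels.values():
            for c in cands:
                if items.issuperset(c):
                    counts[c] += 1
    return {size: {c: counts[c] for c in cands if counts[c] >= min_count}
            for size, cands in levels.items()}


# Toy data (made up for illustration only).
transactions = [("a", "b", "c"), ("a", "b", "c"), ("a", "b"),
                ("a", "c"), ("b", "c", "d")]
frequent_1 = [("a",), ("b",), ("c",)]
result = combined_phase(transactions, frequent_1, k=2, n_passes=2, min_count=2)
```

Skipping pruning cannot change which itemsets come out as frequent, because every candidate is still filtered by its counted support; it only trades extra counting work for less candidate-generation work, which is the trade-off the abstract analyses.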