A fast and parallel algorithm for frequent pattern mining from big data in many-task environments. (2017)
- Record Type:
- Journal Article
- Title:
- A fast and parallel algorithm for frequent pattern mining from big data in many-task environments. (2017)
- Main Title:
- A fast and parallel algorithm for frequent pattern mining from big data in many-task environments
- Authors:
- Lin, Wei-Tee
Chu, Chih-Ping
- Abstract:
- Many studies have tried to efficiently discover frequent patterns in large databases. The algorithms used in these studies fall into two main categories: Apriori algorithms and frequent pattern growth (FP-growth) algorithms. Apriori algorithms operate according to a generate-and-test approach, so performance suffers from the testing of too many candidate itemsets. Therefore, most recent studies have applied an FP-growth approach to the discovery of frequent patterns. The rapid growth of data, however, has introduced new challenges for the mining of frequent patterns, in terms of both execution efficiency and scalability. Big data often contains a large number of items, a large number of transactions and a long average transaction length, which result in large FP-trees. In addition to its dependence on data characteristics, FP-tree size is also sensitive to the minimum support threshold. This is because a small support threshold is likely to produce many branches per node, greatly enlarging the FP-tree and the number of reconstructed conditional pattern-based trees. In this paper, we propose a novel algorithm and architecture for efficiently mining frequent patterns from big data in distributed many-task computing environments. Through empirical evaluation of various simulation conditions, we show that the proposed method delivers excellent execution time.
- Is Part Of:
- International journal of high performance computing and networking. Volume 10:Number 3(2017)
- Journal:
- International journal of high performance computing and networking
- Issue:
- Volume 10:Number 3(2017)
- Issue Display:
- Volume 10, Issue 3 (2017)
- Year:
- 2017
- Volume:
- 10
- Issue:
- 3
- Issue Sort Value:
- 2017-0010-0003-0000
- Page Start:
- 157
- Page End:
- 167
- Publication Date:
- 2017
- Subjects:
- data mining -- big data -- many-task computing -- frequent patterns mining
High performance computing -- Periodicals
Computer networks -- Periodicals
High performance computing
Periodicals
004.05
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijhpcn
http://www.metapress.com/openurl.asp?genre=journal&issn=1740-0562
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1740-0562
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 8956.xml