Fault-tolerant tile mining. (1st July 2018)
- Record Type:
- Journal Article
- Title:
- Fault-tolerant tile mining. (1st July 2018)
- Main Title:
- Fault-tolerant tile mining
- Authors:
- Lu, Haibing
Zhu, Wendong
Phan, Joseph
Ghiassi, M.
Fang, Yi
Hong, Yuan
He, Xiaoyun - Abstract:
- Highlights: A new interestingness measure for data mining is proposed. Theoretical properties on the new problem are presented. Efficient branch-and-bound algorithms are developed. Extensive experimental studies on both synthetic and real datasets are conducted. Abstract: Interesting itemset mining is a fundamental research problem in knowledge management and machine learning. It is intended to identify interesting relations between variables in a database using some measures of interestingness and has a number of applications, including market basket analysis, web usage mining, intrusion detection, and many others. This paper proposes a new interestingness measure, the fault-tolerant tile. That is based on two observations: (1) the length of an itemset can be as important as its frequency; (2) knowledge discovery from real-world datasets calls for fault-tolerant data mining (e.g. extracting fault-tolerant association rules, analyzing noisy datasets). Given a user-defined fault tolerance value, we are interested in finding the maximum/top-k fault-tolerant tiles. Due to the exponential search space of candidate itemsets, both problems are NP-hard. While using some monotonic property to prune search space is a common strategy for interesting itemset mining, no monotonic property is available for this problem. To tackle the challenge, we utilize the branch-and-bound search strategy to analyze the characteristics of candidate itemsets at each searching branch and estimatingHighlights: A new interestingness measure for data mining is proposed. Theoretical properties on the new problem are presented. Efficient branch-and-bound algorithms are developed. Extensive experimental studies on both synthetic and real datasets are conducted. Abstract: Interesting itemset mining is a fundamental research problem in knowledge management and machine learning. It is intended to identify interesting relations between variables in a database using some measures of interestingness and has a number of applications, including market basket analysis, web usage mining, intrusion detection, and many others. This paper proposes a new interestingness measure, the fault-tolerant tile. That is based on two observations: (1) the length of an itemset can be as important as its frequency; (2) knowledge discovery from real-world datasets calls for fault-tolerant data mining (e.g. extracting fault-tolerant association rules, analyzing noisy datasets). Given a user-defined fault tolerance value, we are interested in finding the maximum/top-k fault-tolerant tiles. Due to the exponential search space of candidate itemsets, both problems are NP-hard. While using some monotonic property to prune search space is a common strategy for interesting itemset mining, no monotonic property is available for this problem. To tackle the challenge, we utilize the branch-and-bound search strategy to analyze the characteristics of candidate itemsets at each searching branch and estimating their bounds. Our experimental results show that our algorithms can effectively analyze real datasets and retrieve meaningful results. … (more)
- Is Part Of:
- Expert systems with applications. Volume 101(2018)
- Journal:
- Expert systems with applications
- Issue:
- Volume 101(2018)
- Issue Display:
- Volume 101, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 101
- Issue:
- 2018
- Issue Sort Value:
- 2018-0101-2018-0000
- Page Start:
- 25
- Page End:
- 42
- Publication Date:
- 2018-07-01
- Subjects:
- Itemset mining -- Fault-tolerant -- Optimization -- Exact algorithm
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33 - Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174 ↗
http://www.elsevier.com/journals ↗ - DOI:
- 10.1016/j.eswa.2018.02.007 ↗
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 5892.xml