A regression-based algorithm for frequent itemsets mining. (5th September 2019)
- Record Type:
- Journal Article
- Title:
- A regression-based algorithm for frequent itemsets mining. (5th September 2019)
- Main Title:
- A regression-based algorithm for frequent itemsets mining
- Authors:
- Jia, Zirui
Wang, Zengli
- Abstract:
- Abstract: Purpose: Frequent itemset mining (FIM) is a basic topic in data mining. Most FIM methods build itemset database containing all possible itemsets, and use predefined thresholds to determine whether an itemset is frequent. However, the algorithm has some deficiencies. It is more fit for discrete data rather than ordinal/continuous data, which may result in computational redundancy, and some of the results are difficult to be interpreted. The purpose of this paper is to shed light on this gap by proposing a new data mining method. Design/methodology/approach: Regression pattern (RP) model will be introduced, in which the regression model and FIM method will be combined to solve the existing problems. Using a survey data of computer technology and software professional qualification examination, the multiple linear regression model is selected to mine associations between items. Findings: Some interesting associations mined by the proposed algorithm and the results show that the proposed method can be applied in ordinal/continuous data mining area. The experiment of RP model shows that, compared to FIM, the computational redundancy decreased and the results contain more information. Research limitations/implications: The proposed algorithm is designed for ordinal/continuous data and is expected to provide inspiration for data stream mining and unstructured data mining. Practical implications: Compared to FIM, which mines associations between discrete items, RP model could mine associations between ordinal/continuous data sets. Importantly, RP model performs well in saving computational resource and mining meaningful associations. Originality/value: The proposed algorithms provide a novelty view to define and mine association. …
- Is Part Of:
- Data technologies and applications. Volume 54:Number 3(2020)
- Journal:
- Data technologies and applications
- Issue:
- Volume 54:Number 3(2020)
- Issue Display:
- Volume 54, Issue 3 (2020)
- Year:
- 2020
- Volume:
- 54
- Issue:
- 3
- Issue Sort Value:
- 2020-0054-0003-0000
- Page Start:
- 259
- Page End:
- 273
- Publication Date:
- 2019-09-05
- Subjects:
- Data mining -- Questionnaire -- Software industry -- Regression -- Frequent itemset mining -- Ordinal/continuous data
Information science -- Periodicals
Electronic information resources -- Periodicals
Knowledge management -- Periodicals
020.5
- Journal URLs:
- http://www.emeraldinsight.com/loi/dta
http://www.emeraldinsight.com/
- DOI:
- 10.1108/DTA-03-2019-0037
- Languages:
- English
- ISSNs:
- 2514-9288
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13338.xml