Privacy preserving frequent itemset mining: Maximizing data utility based on database reconstruction. Issue 84 (July 2019)
- Record Type:
- Journal Article
- Title:
- Privacy preserving frequent itemset mining: Maximizing data utility based on database reconstruction. Issue 84 (July 2019)
- Main Title:
- Privacy preserving frequent itemset mining: Maximizing data utility based on database reconstruction
- Authors:
- Li, Shaoxin
Mu, Nankun
Le, Junqing
Liao, Xiaofeng
- Abstract:
- The process of frequent itemset mining (FIM) within large-scale databases plays a significant part in many knowledge discovery tasks, where, however, potential privacy breaches are possible. Privacy preserving frequent itemset mining (PPFIM) has thus drawn increasing attention recently, where the ultimate goal is to hide sensitive frequent itemsets (SFIs) so as to leave no confidential knowledge uncovered in the resulting database. Nevertheless, the vast majority of the proposed methods for PPFIM were merely based on database perturbation, which may result in a significant loss of data utility in order to conceal all SFIs. To alleviate this issue, this paper proposes a database reconstruction-based algorithm for PPFIM (DR-PPFIM) that can not only achieve a high degree of privacy but also afford a reasonable data utility. In DR-PPFIM, all SFIs with related frequent itemsets are first identified for removing in the pre-sanitize process by implementing a devised sanitize method. With the remained frequent itemsets, a novel database reconstruction scheme is proposed to reconstruct an appropriate database, where the concepts of inverse frequent itemset mining (IFIM) and database extension are efficiently integrated. In this way, all SFIs are able to be hidden under the same mining threshold while maximizing the data utility of the synthetic database as much as possible. Moreover, we also develop a further hiding strategy in DR-PPFIM to further decrease the significance of SFIs with the purpose of reducing the risk of disclosing confidential knowledge. Extensive comparative experiments are conducted on real databases to demonstrate the superiority of DR-PPFIM in terms of maximizing the utility of data and resisting potential threats.
- Is Part Of:
- Computers & security. Issue 84(2019)
- Journal:
- Computers & security
- Issue:
- Issue 84(2019)
- Issue Display:
- Volume 84, Issue 84 (2019)
- Year:
- 2019
- Volume:
- 84
- Issue:
- 84
- Issue Sort Value:
- 2019-0084-0084-0000
- Page Start:
- 17
- Page End:
- 34
- Publication Date:
- 2019-07
- Subjects:
- Privacy preserving data mining -- Frequent itemset -- Database reconstruction -- Inverse frequent itemset mining -- Database extension
Computer security -- Periodicals
Electronic data processing departments -- Security measures -- Periodicals
005.805
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01674048
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cose.2019.03.008
- Languages:
- English
- ISSNs:
- 0167-4048
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.781000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 10605.xml