A general framework for privacy-preserving of data publication based on randomized response techniques. Issue 96 (February 2021)
- Record Type:
- Journal Article
- Title:
- A general framework for privacy-preserving of data publication based on randomized response techniques. Issue 96 (February 2021)
- Main Title:
- A general framework for privacy-preserving of data publication based on randomized response techniques
- Authors:
- Liu, Chaobin
Chen, Shixi
Zhou, Shuigeng
Guan, Jihong
Ma, Yao
- Abstract:
- Privacy preservation is a paramount concern when publishing datasets that contain sensitive information. Preventing privacy disclosure and providing useful information to legitimate users for data analysis and mining are conflicting goals. Randomized response is a class of techniques that perturb each sensitive value in a certain way, so that personal privacy is protected while the overall trend of the entire dataset is still recoverable. However, existing randomized response techniques do not allow the level of privacy protection to be configured flexibly, support only a few types of aggregate queries, and cannot achieve the best answer accuracy from perturbed data. These drawbacks impair the effectiveness of those techniques. This paper proposes a general framework based on randomized response techniques, which has good flexibility and extensibility, and can improve the effectiveness of randomized response methods. Our approach is validated by extensive experiments and comparison with existing randomized response and generalization methods.
Highlights: A general framework for data publication is proposed based on randomized response techniques, which, by utilizing a matrix decomposition method and properties of the Kronecker product, can reduce the computational complexity of reconstructing unbiased estimates of answers from exponential to linear. A general approach for constructing a recovery matrix from an arbitrary perturbation matrix is proposed, which can minimize the variance of the unbiased estimates. Perturbation and reconstruction algorithms for Boolean and categorical attributes are developed, which can be extended to numerical attributes. Both theoretical analysis and experimental results are given, which validate the proposed approach.
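The perturb-then-reconstruct idea summarized in the abstract can be illustrated with a minimal sketch. This is not the framework from the article: it assumes a simple scheme in which each categorical value is kept with probability `p` and otherwise replaced by a uniformly random category, and it recovers the distribution by solving against the plain perturbation matrix rather than a variance-minimizing recovery matrix. All function names (`perturbation_matrix`, `perturb`, `solve`, `estimate_distribution`) are hypothetical.

```python
import random

def perturbation_matrix(k, p):
    """P[j][i] = Pr(report j | true value i) for the assumed scheme:
    keep the true value with probability p, else draw uniformly from k categories."""
    off = (1.0 - p) / k
    return [[p + off if i == j else off for i in range(k)] for j in range(k)]

def perturb(values, k, p, rng):
    """Randomize each sensitive value independently."""
    return [v if rng.random() < p else rng.randrange(k) for v in values]

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        pv = a[col][col]
        a[col] = [x / pv for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n] for row in a]

def estimate_distribution(reports, k, p):
    """Unbiased estimate of the true category frequencies from perturbed reports:
    the observed frequencies satisfy E[observed] = P @ true, so solve for true."""
    n = len(reports)
    observed = [reports.count(j) / n for j in range(k)]
    return solve(perturbation_matrix(k, p), observed)

# Usage: 50,000 synthetic records over 3 categories, half of them randomized.
rng = random.Random(0)
true_values = rng.choices(range(3), weights=[0.6, 0.3, 0.1], k=50000)
reports = perturb(true_values, 3, 0.5, rng)
estimate = estimate_distribution(reports, 3, 0.5)
```

No individual report reveals its true value with certainty, yet `estimate` should land close to the underlying frequencies. Lowering `p` strengthens privacy and increases the variance of the estimate, which is the trade-off the abstract refers to.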
- Is Part Of:
- Information systems. Issue 96(2021)
- Journal:
- Information systems
- Issue:
- Issue 96(2021)
- Issue Display:
- Volume 96, Issue 96 (2021)
- Year:
- 2021
- Volume:
- 96
- Issue:
- 96
- Issue Sort Value:
- 2021-0096-0096-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-02
- Subjects:
- Privacy preserving -- Randomized response -- Data publishing
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2020.101648
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15001.xml