Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks. (20th February 2017)
- Record Type:
- Journal Article
- Title:
- Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks. (20th February 2017)
- Main Title:
- Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks
- Authors:
- Raisaro, Jean Louis
Tramèr, Florian
Ji, Zhanglong
Bu, Diyue
Zhao, Yongan
Carey, Knox
Lloyd, David
Sofia, Heidi
Baker, Dixie
Flicek, Paul
Shringarpure, Suyash
Bustamante, Carlos
Wang, Shuang
Jiang, Xiaoqian
Ohno-Machado, Lucila
Tang, Haixu
Wang, XiaoFeng
Hubaux, Jean-Pierre
- Abstract:
- The Global Alliance for Genomics and Health (GA4GH) created the Beacon Project as a means of testing the willingness of data holders to share genetic data in the simplest technical context: a query for the presence of a specified nucleotide at a given position within a chromosome. Each participating site (or "beacon") is responsible for assuring that genomic data are exposed through the Beacon service only with the permission of the individual to whom the data pertains and in accordance with the GA4GH policy and standards. While recognizing the inference risks associated with large-scale data aggregation, and the fact that some beacons contain sensitive phenotypic associations that increase privacy risk, the GA4GH adjudged the risk of re-identification based on the binary yes/no allele-presence query responses as acceptable. However, recent work demonstrated that, given a beacon with specific characteristics (including relatively small sample size and an adversary who possesses an individual's whole genome sequence), the individual's membership in a beacon can be inferred through repeated queries for variants present in the individual's genome. In this paper, we propose three practical strategies for reducing re-identification risks in beacons. The first two strategies manipulate the beacon such that the presence of rare alleles is obscured; the third strategy budgets the number of accesses per user for each individual genome. Using a beacon containing data from the 1000 Genomes Project, we demonstrate that the proposed strategies can effectively reduce re-identification risk in beacon-like datasets.
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 24:Number 4(2017:Jul.)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 24:Number 4(2017:Jul.)
- Issue Display:
- Volume 24, Issue 4 (2017)
- Year:
- 2017
- Volume:
- 24
- Issue:
- 4
- Issue Sort Value:
- 2017-0024-0004-0000
- Page Start:
- 799
- Page End:
- 805
- Publication Date:
- 2017-02-20
- Subjects:
- genomic privacy -- ga4gh -- beacon -- re-identification -- genomic data sharing
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocw167
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15137.xml