Anonymization: The imperfect science of using data while preserving privacy

Cited by: 11
Authors
Gadotti, Andrea [1, 2]
Rocher, Luc [1, 2]
Houssiau, Florimond [1, 3]
Cretu, Ana-Maria [1, 4]
de Montjoye, Yves-Alexandre [1]
Affiliations
[1] Imperial Coll London, Exhibit Rd, London SW7 2AZ, England
[2] Univ Oxford, Wellington Sq, Oxford OX1 2JD, England
[3] Alan Turing Inst, 96 Euston Rd, London NW1 2DB, England
[4] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Source
Science Advances | 2024 / Vol. 10 / Issue 29
Funding
UK Research and Innovation (UKRI)
Keywords
DIFFERENTIAL PRIVACY; DE-ANONYMIZATION; ATTACKS; REIDENTIFICATION; MEMBERSHIP; RECONSTRUCTION; INFORMATION; SENSITIVITY; RISK; US;
DOI
10.1126/sciadv.adn7053
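Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract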
Information about us, our actions, and our preferences is created at scale through surveys or scientific studies or as a result of our interaction with digital devices such as smartphones and fitness trackers. The ability to safely share and analyze such data is key for scientific and societal progress. Anonymization is considered by scientists and policy-makers as one of the main ways to share data while minimizing privacy risks. In this review, we offer a pragmatic perspective on the modern literature on privacy attacks and anonymization techniques. We discuss traditional de-identification techniques and their strong limitations in the age of big data. We then turn our attention to modern approaches to share anonymous aggregate data, such as data query systems, synthetic data, and differential privacy. We find that, although no perfect solution exists, applying modern techniques while auditing their guarantees against attacks is the best approach to safely use and share data today. Safe anonymization and sharing of personal data will be achieved by combining formal privacy methods with red-teaming.
Pages: 22