Survey on Privacy-Preserving Techniques for Microdata Publication

Cited by: 13
Authors
Carvalho, Tania [1]
Moniz, Nuno [2]
Faria, Pedro [3]
Antunes, Luis [1]
Affiliations
[1] Univ Porto, DCC Fac Sci, Porto, Portugal
[2] Univ Porto, INESC TEC, Porto, Portugal
[3] TekPrivacy, Porto, Portugal
Funding
EU Horizon 2020;
Keywords
Data privacy; microdata; statistical disclosure control; privacy-preserving techniques; predictive performance; DISCLOSURE RISK; PUBLISHING MICRODATA; ANONYMIZED DATA; RECORD-LINKAGE; MICROAGGREGATION; ALGORITHM; CONFIDENTIALITY; METHODOLOGY; INFORMATION; SECURITY;
DOI
10.1145/3588765
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The exponential growth of collected, processed, and shared microdata has given rise to concerns about individuals' privacy. As a result, laws and regulations have emerged to control what organisations do with microdata and how they protect it. Statistical Disclosure Control seeks to reduce the risk of confidential information disclosure by de-identifying the data. Such de-identification is achieved through privacy-preserving techniques (PPTs). However, de-identification usually results in loss of information, with a possible impact on data analysis precision and model predictive performance. The main goal is to protect individuals' privacy while maintaining the interpretability of the data (i.e., its usefulness). Statistical Disclosure Control is an expanding area that needs further exploration, since there is still no solution that guarantees optimal privacy and utility. This survey focuses on all steps of the de-identification process. We present existing PPTs used in microdata de-identification, privacy measures suitable for several disclosure types, and information loss and predictive performance measures. We also discuss the main challenges raised by privacy constraints, describe the main approaches to handling these obstacles, review the taxonomies of PPTs, provide a theoretical analysis of existing comparative studies, and raise multiple open issues.
Pages: 42