Privacy-Preserving Correlated Data Publication: Privacy Analysis and Optimal Noise Design

Cited by: 7
Authors
Sun, Mingjing [1, 2]
Zhao, Chengcheng [3, 4, 5]
He, Jianping [1, 2]
Cheng, Peng [3 ]
Quevedo, Daniel E. [6 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
[2] Minist Educ China, Key Lab Syst Control & Signal Proc, Shanghai 200240, Peoples R China
[3] Zhejiang Univ, State Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
[4] Zhejiang Univ, Inst Cyberspace Res, Hangzhou 310027, Peoples R China
[5] Univ Victoria, Dept Elect & Comp Engn, Victoria, BC V8P 5C2, Canada
[6] Queensland Univ Technol, Sch Elect Engn & Robot, Brisbane, Qld 4000, Australia
Source
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING | 2021 / Vol. 8 / Issue 03
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Correlated data; data privacy; multi-dimension; noise adding mechanism; optimal distribution; DIFFERENTIAL PRIVACY; INFORMATION;
DOI
10.1109/TNSE.2020.3044590
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
The privacy issue in data publication is critical and has been extensively studied. Correlation is unavoidable in data publication, which universally manifests intrinsic correlations owing to social, physical, behavioral, and genetic relationships. However, most of the existing works assume that private data is independent, i.e., the correlation among data is neglected. In this paper, we investigate the privacy concern of data publication where deterministic and probabilistic correlations are considered, respectively. Specifically, (epsilon, delta)-multi-dimensional data-privacy (MDDP) is proposed to quantify the correlated data privacy. It characterizes the disclosure probability of the published data being jointly estimated with the correlation under a given accuracy. Then, we explore the effects of deterministic and probabilistic correlations on privacy disclosure, respectively. For both kinds of correlations, it is shown that the privacy disclosure with correlations increases compared to the one without correlation knowledge. Meanwhile, a closed-form expression of disclosure probability and a strict bound of privacy disclosure gain are derived, respectively. To minimize the disclosure probability, we provide the optimal noise distribution in the sense of (epsilon, delta)-MDDP. Extensive simulations on a real dataset verify our analytical results.
Pages: 2014-2024
Page count: 11