PrivTDSI: A Local Differentially Private Approach for Truth Discovery via Sampling and Inference

Cited by: 8
Authors
Zhang, Pengfei [1 ]
Cheng, Xiang [1 ]
Su, Sen [1 ]
Zhu, Binyuan [1 ]
Affiliations
[1] Beijing Univ Posts and Telecommun, State Key Lab Networking & Switching Tech, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Task analysis; Privacy; Servers; Reliability; Crowdsourcing; Protocols; Noise measurement; truth discovery; local differential privacy; privacy protection; sampling and inference; CROWDSOURCED DATA; AWARE; NOISE;
DOI
10.1109/TBDATA.2022.3186175
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Truth discovery is an effective way to identify the aggregated truth of each task from multiple observations contributed by workers of varying reliability. However, existing studies are insufficient to protect individuals' privacy, as they either guarantee only weaker versions of local differential privacy (LDP) or implicitly assume that the tasks are independent. In this paper, we investigate, for the first time, the problem of truth discovery that achieves rigorous LDP for each worker with continuous inputs and without the independence assumption. We present a locally differentially private truth discovery approach called PrivTDSI, based on sampling and inference, with solid privacy and utility guarantees. In PrivTDSI, the server first determines which values of each worker should be sampled according to a sample proportion and sends the indexes of these values to each worker. Each worker then adds noise to the sampled values for privacy protection and uploads them to the server. After receiving the noisy sampled values from all workers, the server infers the unsampled values and then conducts truth discovery based on both the noisy sampled values and the inferred values. In particular, to determine the sample proportion, we formulate a constrained nonlinear programming problem and give a closed-form solution. Moreover, to decide which values of each worker should be sampled while avoiding the situation where some workers or tasks are not sampled at all, we develop a two-stage sampling method called TOSS. Furthermore, to infer the unsampled values accurately, we design a quality-aware inference method based on matrix factorization, called QualityMF. Experimental results on two real-world datasets and a synthetic dataset demonstrate the effectiveness of PrivTDSI.
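The pipeline summarized in the abstract (server-side index sampling, worker-side perturbation of continuous values, and reliability-weighted aggregation at the server) can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: uniform random sampling stands in for TOSS, a plain Laplace mechanism stands in for the paper's perturbation scheme, fixed weights stand in for learned worker reliabilities and for the QualityMF inference step, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def server_sample_indexes(n_workers, n_tasks, proportion, rng):
    """Server side: choose which (worker, task) values to collect.
    Illustrative uniform sampling; the paper's TOSS method additionally
    guarantees every worker and every task is sampled at least once."""
    return rng.random((n_workers, n_tasks)) < proportion


def worker_perturb(values, mask_row, epsilon, sensitivity, rng):
    """Worker side: add Laplace noise to the sampled values only
    (a standard LDP mechanism for continuous inputs), then return
    the noisy sampled values together with their indexes."""
    noisy = values.astype(float).copy()
    idx = np.where(mask_row)[0]
    noisy[idx] += rng.laplace(0.0, sensitivity / epsilon, size=idx.size)
    return noisy[idx], idx


def weighted_truth_discovery(values, weights):
    """Server side: aggregate per-task reports, weighting each worker
    by an estimated reliability (here a simple weighted mean)."""
    return (weights[:, None] * values).sum(axis=0) / weights.sum()
```

With two workers and two tasks, `weighted_truth_discovery(np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([1.0, 3.0]))` yields the per-task weighted means `[2.5, 3.5]`, illustrating how more reliable workers pull the estimated truths toward their reports.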
Pages: 471-484
Number of pages: 14