Locally Private Set-Valued Data Analyses: Distribution and Heavy Hitters Estimation

被引:6
作者
Wang, Shaowei [1 ]
Li, Yuntong [1 ]
Zhong, Yusen [1 ]
Chen, Kongyang [1 ]
Wang, Xianmin [1 ]
Zhou, Zhili [1 ]
Peng, Fei [1 ]
Qian, Yuqiu [2 ]
Du, Jiachun [2 ]
Yang, Wei [3 ]
机构
[1] Guangzhou Univ, Inst Artificial Intelligence & Blockchain, Guangzhou 511370, Guangdong, Peoples R China
[2] Tencent Games, Interact Entertainment Grp, Shenzhen 518054, Guangdong, Peoples R China
[3] Univ Sci & Technol China, Inst Adv Res, Hefei 230052, Anhui, Peoples R China
基金
中国国家自然科学基金;
关键词
Estimation; Wheels; Protocols; Servers; Differential privacy; Costs; Distributed databases; Distributed data aggregation; frequency estimation; heavy-hitter identification; local differential privacy;
D O I
10.1109/TMC.2023.3342056
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In many mobile applications, user-generated data are presented as set-valued data. To tackle potential privacy threats in analyzing these valuable data, local differential privacy has been attracting substantial attention. However, existing approaches only provide sub-optimal utility and are expensive in computation and communication for set-valued data distribution estimation and heavy-hitter identification. In this paper, we propose a utility-optimal and efficient set-valued data publication method (i.e., Wheel mechanism). On the user side, the computational complexity is only O(min{m log m, me(is an element of)}) and communication costs are O(is an element of+ log m) bits, where m is the number of items, d is the domain size and epsilon is the privacy budget, while existing approaches usually depend on O(d) or O(log d)(d >> m). Our theoretical analyses reveal the estimation errors have been reduced from the previously known O(m(2)d/n is an element of(2)) to the optimal rate O(md/n is an element of(2)). Additionally, for heavy-hitter identification, we present a variant of the Wheel mechanism as an efficient frequency oracle, entailing only O(root n) computational complexity. This heavy-hitter protocol achieves an identification bar of O(1/is an element of root m/n log d),, reducing by a factor of O(root m) relative to existing protocols. Extensive experiments demonstrate our methods are 3-100x faster than existing approaches and have optimized statistical efficiency.
引用
收藏
页码:8050 / 8065
页数:16
相关论文
共 48 条
[1]  
[Anonymous], 2003, ACM KDD
[2]   Privacy and Security in Internet of Things and Wearable Devices [J].
Arias, Orlando ;
Wurm, Jacob ;
Khoa Hoang ;
Jin, Yier .
IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS, 2015, 1 (02) :99-109
[3]  
Bassily R., 2017, Proc. NeurIPS, P2288
[4]   Local, Private, Efficient Protocols for Succinct Histograms [J].
Bassily, Raef ;
Smith, Adam .
STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2015, :127-135
[5]   Lightweight Techniques for Private Heavy Hitters [J].
Boneh, Dan ;
Boyle, Elette ;
Corrigan-Gibbs, Henry ;
Gilboa, Niv ;
Ishai, Yuval .
2021 IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP, 2021, :762-776
[6]   Heavy Hitters and the Structure of Local Privacy [J].
Bun, Mark ;
Nelson, Jelani ;
Stemmer, Uri .
ACM TRANSACTIONS ON ALGORITHMS, 2019, 15 (04)
[7]  
Cambridge U.K., 2017, Probability and Computing: Random-ization and Probabilistic Techniques in Algorithms and Data Analysis
[8]  
Chen R, 2011, PROC VLDB ENDOW, V4, P1087
[9]  
Cormode G., 2012, ICDT 2012, P299
[10]  
Dachman-Soled D, 2009, LECT NOTES COMPUT SC, V5536, P125, DOI 10.1007/978-3-642-01957-9_8