Sampling Based Fast Publishing Algorithm with Differential Privacy for Data Stream

Cited by: 0
Authors
Wang, Xiujun [1, 2]
Mo, Lei [3]
Zheng, Xiao [1, 2]
Wei, Linna [1, 2]
Dong, Jun [4]
Liu, Zhi [5]
Guo, Longkun [3]
Affiliations
[1] School of Computer Science and Technology, Anhui University of Technology, Anhui, Ma'anshan
[2] Anhui Engineering Research Center for Intelligent Applications and Security of Industrial Internet, Anhui University of Technology, Anhui, Ma'anshan
[3] School of Mathematics, Fuzhou University, Fuzhou
[4] Institute of Intelligent Machines, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei
[5] The University of Electro-Communications, Tokyo
Source
Jisuanji Yanjiu yu Fazhan/Computer Research and Development | 2024, Vol. 61, No. 10
Keywords
cloud native database; data publication; data sampling; data stream; differential privacy; sliding window
DOI
10.7544/issn1000-1239.202440481
Abstract
Many cloud-native database applications must handle massive data streams. To analyze group trends in these streams in real time without compromising the privacy of individual users, such applications need to build differentially private histograms quickly over the most recent data at any given moment. Existing histogram publishing methods, however, lack efficient data structures, which makes it difficult to extract key information fast enough to keep the published data usable in real time. To address this issue, we analyze the relationship between data sampling and privacy protection in depth and propose a sampling-based fast publishing algorithm with differential privacy for data streams (SPF). SPF introduces, for the first time, an efficient data stream sampling sketch structure (EDS), which samples and statistically estimates the data within a sliding window and filters out unreasonable data, enabling rapid extraction of key information. We then show that the approximations output by the EDS structure are theoretically equivalent to adding differential privacy noise to the true values. Finally, to meet the privacy protection strength specified by the user while still reflecting the true state of the original data stream, we propose an adaptive noise addition algorithm based on efficient data stream sampling. Based on the relationship between the user-specified privacy protection strength and the privacy protection strength already provided by the EDS structure, the algorithm adaptively generates the final publishable histogram through privacy allocation. Experiments show that, compared with existing algorithms, SPF significantly reduces time and space overhead while maintaining the same data usability. © 2024 Science Press. All rights reserved.
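For orientation, the publishing task described in the abstract can be illustrated with a minimal baseline sketch: exact counts over a sliding window, released with Laplace noise. This is not the SPF algorithm or the EDS structure, whose details are not given here. The class and function names, the count-based window, the fixed public list of buckets, and the default sensitivity of 1 (neighbouring windows differ by adding or removing one item) are all assumptions made for illustration.

```python
import random
from collections import Counter, deque


class SlidingWindowHistogram:
    """Exact per-bucket counts over the most recent `window_size` items.

    Hypothetical baseline for illustration; it stores every item in the
    window, which is the cost a sampling sketch would aim to avoid.
    """

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.window = deque()
        self.counts = Counter()

    def add(self, bucket):
        # Insert the new item, then expire the oldest one if the window is full.
        self.window.append(bucket)
        self.counts[bucket] += 1
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts[expired] -= 1
            if self.counts[expired] == 0:
                del self.counts[expired]


def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) variables is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def publish(hist, buckets, epsilon, rng, sensitivity=1.0):
    """Return a noisy histogram via the Laplace mechanism.

    `buckets` is assumed to be a fixed, public domain, so that empty
    buckets also receive noise. `sensitivity` is an assumption: use 2.0
    if neighbouring windows differ by replacing one item.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    return {b: hist.counts.get(b, 0) + laplace_noise(scale, rng) for b in buckets}


if __name__ == "__main__":
    rng = random.Random(0)
    hist = SlidingWindowHistogram(window_size=1000)
    for _ in range(5000):
        hist.add(rng.choice(["a", "b", "c"]))
    print(publish(hist, ["a", "b", "c", "d"], epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy and noisier counts. The baseline keeps the whole window in memory and adds the full noise on every release, which is the time and space overhead the abstract says SPF reduces.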
Pages: 2433-2447
Page count: 14
Related papers
39 in total
  • [1] Dong Haowen, Zhang Chao, Li Guoliang, et al., Survey on cloud-native databases, Journal of Software, 35, 2, pp. 899-926, (2023)
  • [2] Zhao Zhanhao, Pan Hexiang, Chen Gang, et al., VeriTxn: Verifiable transactions for cloud-native databases with storage disaggregation, Proceedings of the ACM on Management of Data, 1, 4, pp. 1-27, (2023)
  • [3] Papadogiannaki E, Ioannidis S., A survey on encrypted network traffic analysis applications, techniques, and countermeasures, ACM Computing Surveys, 54, 6, pp. 1-35, (2021)
  • [4] Shahraki A, Taherkordi A, Haugen O., TONTA: Trend-based online network traffic analysis in ad-hoc IoT networks, Computer Networks, 194, (2021)
  • [5] Shahid M R, Blanc G, Zhang Z, et al., IoT devices recognition through network traffic analysis, Proc of 2018 IEEE Int Conf on Big Data (Big Data), pp. 5187-5192, (2018)
  • [6] Butila E V, Boboc R G., Urban traffic monitoring and analysis using unmanned aerial vehicles (UAVs): A systematic literature review, Remote Sensing, 14, 3, (2022)
  • [7] Jain N K, Saini R K, Mittal P., A review on traffic monitoring system techniques, Soft Computing: Theories and Applications, Proc of SOCTA, 2019, pp. 569-577, (2017)
  • [8] Figueiras P, Herga Z, Guerreiro G, et al., Real-time monitoring of road traffic using data stream mining, Proc of 2018 IEEE Int Conf on Engineering, Technology and Innovation (ICE/ITMC), pp. 1-8, (2018)
  • [9] Fang B, Zhang P., Big data in finance, Big Data Concepts, Theories, and Applications, pp. 391-412, (2016)
  • [10] Fikri N, Rida M, Abghour N, et al., An adaptive and real-time based architecture for financial data integration, Journal of Big Data, 6, pp. 1-25, (2019)