GAPBAS: Genetic algorithm-based privacy budget allocation strategy in differential privacy K-means clustering algorithm

被引:9
作者
Li, Yong [1 ]
Song, Xiao [1 ]
Tu, Yuchun [1 ]
Liu, Ming [1 ]
机构
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100019, Peoples R China
基金
北京市自然科学基金;
关键词
Differential privacy; Genetic algorithm; Privacy budget allocation; Combinatorial optimization problem; K -means clustering algorithm;
D O I
10.1016/j.cose.2023.103697
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The differential privacy k-means (DP k-means) clustering algorithm emerged to address the privacy protection challenges in the field of data mining. However, the algorithm encounters difficulties in achieving clustering usability and convergence. Privacy budget (epsilon), a critical parameter determining the noise addition in differential privacy algorithms, garners significant attention. Consequently, researchers have shifted their focus to studying privacy budget allocation strategies within the DP k-means clustering algorithm. However, the selection of a privacy budget allocation strategy in the DP k-means algorithm is an NP-hard problem. Our initial intuition is that genetic algorithms can efficiently discover relatively optimal privacy budget sequences. In this context, we propose a genetic algorithm-based privacy budget allocation strategy (GAPBAS) to ensure the convergence and usability of the DP k-means algorithm. Firstly, convergence is ensured by selecting improved initial centroids and rigorously controlling the minimum privacy budget for the DP k-means algorithm. Additionally, the privacy budget allocation strategy of the DP k-means algorithm is reformulated as a combinatorial optimization problem. This entails merging privacy budgets from multiple iterative rounds into a sequential sequence and utilizing a genetic algorithm to select the optimal privacy budget allocation strategy, thereby significantly enhancing the usability of the DP k-means algorithm. Comparative experiments against the other four privacy budget allocation strategies in the DP k-means algorithm demonstrate the superior performance of the genetic algorithm-based privacy budget allocation strategy at the same level of privacy protection.
引用
收藏
页数:12
相关论文
共 50 条
  • [1] An Improved Differential Privacy K-means Algorithm Based on MapReduce
    Yao, Shunyuan
    2018 11TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2018, : 141 - 145
  • [2] RETRACTED: CVDP k-means clustering algorithm for differential privacy based on coefficient of variation (Retracted Article)
    Kong, Yuting
    Qian, Yurong
    Tan, Fuxiang
    Bai, Lu
    Shao, Jinxin
    Ma, Tinghuai
    Tereshchenko, Sergei Nikolayevich
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 43 (05) : 6027 - 6045
  • [3] On K-means Data Clustering Algorithm with Genetic Algorithm
    Kapil, Shruti
    Chawla, Meenu
    Ansari, Mohd Dilshad
    2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 202 - 206
  • [4] Random Forest Algorithm Based on Linear Privacy Budget Allocation
    Dong, Yanling
    Zhang, Shufen
    Xu, Jingcheng
    Wang, Haoshi
    Liu, Jiqiang
    JOURNAL OF DATABASE MANAGEMENT, 2022, 33 (02)
  • [5] Distributed K-Means clustering guaranteeing local differential privacy
    Xia, Chang
    Hua, Jingyu
    Tong, Wei
    Zhong, Sheng
    COMPUTERS & SECURITY, 2020, 90
  • [6] Genetic Algorithm-Based Privacy Preserving Collaborative Filtering
    Birgin, Mustafa Kemal
    Bilge, Alper
    2019 INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS CONFERENCE (ASYU), 2019, : 89 - 93
  • [7] Investigation of Strawberry Irrigation Strategy Based on K-means Clustering Algorithm
    Li L.
    Wang H.
    Wu Y.
    Chen S.
    Wang H.
    Sigrimis N.A.
    Wang, Haihua (whaihua@cau.edu.cn), 1600, Chinese Society of Agricultural Machinery (51): : 295 - 302
  • [8] Representing the New Model for Improving K-Means Clustering Algorithm based on Genetic Algorithm
    Maghsoudi, Rouhollah
    Delavar, Arash Ghorbannia
    Hoseyny, Somayye
    Asgari, Rahmatollah
    Heidari, Yaghub
    JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE-JMCS, 2011, 2 (02): : 329 - 336
  • [9] Optimization of K-Means clustering Using Genetic Algorithm
    Irfan, Shadab
    Dwivedi, Gaurav
    Ghosh, Subhajit
    2017 INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES FOR SMART NATION (IC3TSN), 2017, : 157 - 162
  • [10] Weighted K-means Clustering Analysis Based on Improved Genetic Algorithm
    Zhang, Tongjie
    Cao, Yan
    Mu, Xiangwei
    SENSORS, MECHATRONICS AND AUTOMATION, 2014, 511-512 : 904 - 908