GAPBAS: Genetic algorithm-based privacy budget allocation strategy in differential privacy K-means clustering algorithm

被引:9
作者
Li, Yong [1 ]
Song, Xiao [1 ]
Tu, Yuchun [1 ]
Liu, Ming [1 ]
机构
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100019, Peoples R China
基金
北京市自然科学基金;
关键词
Differential privacy; Genetic algorithm; Privacy budget allocation; Combinatorial optimization problem; K -means clustering algorithm;
D O I
10.1016/j.cose.2023.103697
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The differential privacy k-means (DP k-means) clustering algorithm emerged to address the privacy protection challenges in the field of data mining. However, the algorithm encounters difficulties in achieving clustering usability and convergence. Privacy budget (epsilon), a critical parameter determining the noise addition in differential privacy algorithms, garners significant attention. Consequently, researchers have shifted their focus to studying privacy budget allocation strategies within the DP k-means clustering algorithm. However, the selection of a privacy budget allocation strategy in the DP k-means algorithm is an NP-hard problem. Our initial intuition is that genetic algorithms can efficiently discover relatively optimal privacy budget sequences. In this context, we propose a genetic algorithm-based privacy budget allocation strategy (GAPBAS) to ensure the convergence and usability of the DP k-means algorithm. Firstly, convergence is ensured by selecting improved initial centroids and rigorously controlling the minimum privacy budget for the DP k-means algorithm. Additionally, the privacy budget allocation strategy of the DP k-means algorithm is reformulated as a combinatorial optimization problem. This entails merging privacy budgets from multiple iterative rounds into a sequential sequence and utilizing a genetic algorithm to select the optimal privacy budget allocation strategy, thereby significantly enhancing the usability of the DP k-means algorithm. Comparative experiments against the other four privacy budget allocation strategies in the DP k-means algorithm demonstrate the superior performance of the genetic algorithm-based privacy budget allocation strategy at the same level of privacy protection.
引用
收藏
页数:12
相关论文
共 50 条
  • [21] K-means Clustering Optimization Algorithm Based on MapReduce
    Li, Zhihua
    Song, Xudong
    Zhu, Wenhui
    Chen, Yanxia
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL SYMPOSIUM ON COMPUTERS & INFORMATICS, 2015, 13 : 198 - 203
  • [22] Density Peak Clustering Algorithm Based on Differential Privacy Preserving
    Chen, Yun
    Du, Yunlan
    Cao, Xiaomei
    [J]. SCIENCE OF CYBER SECURITY, SCISEC 2019, 2019, 11933 : 20 - 32
  • [23] K-Means Clustering With Local dχ-Privacy for Privacy-Preserving Data Analysis
    Yang, Mengmeng
    Tjuawinata, Ivan
    Lam, Kwok-Yan
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2022, 17 : 2524 - 2537
  • [24] On the Security of Distributed Multi-Agent K-Means Clustering With Local Differential Privacy
    Shi, Congcong
    Huang, Xiuli
    Yu, Pengfei
    [J]. IEEE ACCESS, 2024, 12 : 124751 - 124763
  • [25] Genetic TKM: A Hybrid Clustering Method Based on Genetic Algorithm, Tabu Search and K-Means
    Yaghini, Masoud
    Gereilinia, Nasim
    [J]. INTERNATIONAL JOURNAL OF APPLIED METAHEURISTIC COMPUTING, 2013, 4 (01) : 67 - 77
  • [26] A Novel Genetic Algorithm Based k-means Algorithm for Cluster Analysis
    El-Shorbagy, M. A.
    Ayoub, A. Y.
    El-Desoky, I. M.
    Mousa, A. A.
    [J]. INTERNATIONAL CONFERENCE ON ADVANCED MACHINE LEARNING TECHNOLOGIES AND APPLICATIONS (AMLTA2018), 2018, 723 : 92 - 101
  • [27] Based Differential Evolution K-means Algorithm for Fault Clustering on Flight Control System
    Gu Wei
    Zhang Weiguo
    Huang Zhiyi
    Li Lili
    [J]. ISTM/2009: 8TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-6, 2009, : 1586 - 1590
  • [28] Enhancing Stock Prediction Clustering Using K-Means with Genetic Algorithm
    Desokey, Eslam Nader
    Badr, Amr
    Hegazy, Abdel Fatah
    [J]. 2017 13TH INTERNATIONAL COMPUTER ENGINEERING CONFERENCE (ICENCO), 2017, : 256 - 261
  • [29] A hybrid clustering technique combining a novel genetic algorithm with K-Means
    Rahman, Md Anisur
    Islam, Md Zahidul
    [J]. KNOWLEDGE-BASED SYSTEMS, 2014, 71 : 345 - 365
  • [30] CUDA-based parallel K-means clustering algorithm
    Huo, Yingqiu
    Qin, Renbo
    Xing, Caiyan
    Chen, Xi
    Fang, Yong
    [J]. Nongye Jixie Xuebao/Transactions of the Chinese Society for Agricultural Machinery, 2014, 45 (11): : 47 - 53and74