A Convergent Differentially Private k-Means Clustering Algorithm

被引:13
作者
Lu, Zhigang [1 ]
Shen, Hong [1 ,2 ]
机构
[1] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
[2] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
来源
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT I | 2019年 / 11439卷
基金
澳大利亚研究理事会; 国家重点研发计划;
关键词
Differential privacy; Adversarial machine learning; k-means clustering;
D O I
10.1007/978-3-030-16148-4_47
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Preserving differential privacy (DP) for the iterative clustering algorithms has been extensively studied in the interactive and the non-interactive settings. However, existing interactive differentially private clustering algorithms suffer from a non-convergence problem, i.e., these algorithms may not terminate without a predefined number of iterations. This problem severely impacts the clustering quality and the efficiency of the algorithm. To resolve this problem, we propose a novel iterative approach in the interactive settings which controls the orientation of the centroids movement over the iterations to ensure the convergence by injecting DP noise in a selected area. We prove that, in the expected case, our approach converges to the same centroids as Lloyd's algorithm in at most twice the iterations of Lloyd's algorithm. We perform experimental evaluations on real-world datasets to show that our algorithm outperforms the state-of-the-art of the interactive differentially private clustering algorithms with a guaranteed convergence and better clustering quality to meet the same DP requirement.
引用
收藏
页码:612 / 624
页数:13
相关论文
共 50 条
  • [31] DEM Fusion using a modified k-means clustering algorithm
    Fuss, Colleen E.
    Berg, Aaron A.
    Lindsay, John B.
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2016, 9 (12) : 1242 - 1255
  • [32] An Efficient Hierarchy-Based of K-Means Clustering Algorithm
    Li Yong-peng
    Zhang Bo-tao
    Zhang Shuai-qin
    2008 INTERNATIONAL WORKSHOP ON INFORMATION TECHNOLOGY AND SECURITY, 2008, : 106 - 110
  • [33] NEW ALGORITHM FOR CLUSTERING DISTRIBUTED DATA USING K-MEANS
    Khedr, Ahmed M.
    Bhatnagar, Raj K.
    COMPUTING AND INFORMATICS, 2014, 33 (04) : 943 - 964
  • [34] An Enhanced K-Means Clustering Algorithm for Phishing Attack Detections
    Al-Sabbagh, Abdallah
    Hamze, Khalil
    Khan, Samiya
    Elkhodr, Mahmoud
    ELECTRONICS, 2024, 13 (18)
  • [35] An improved K-means clustering algorithm in agricultural image segmentation
    Cheng, Huifeng
    Peng, Hui
    Liu, Shanmei
    PIAGENG 2013: IMAGE PROCESSING AND PHOTONICS FOR AGRICULTURAL ENGINEERING, 2013, 8761
  • [36] Centronit: Initial Centroid Designation Algorithm for K-Means Clustering
    Barakbah, Ali Ridho
    Arai, Kohei
    EMITTER-INTERNATIONAL JOURNAL OF ENGINEERING TECHNOLOGY, 2014, 2 (01) : 50 - 62
  • [37] An iterative algorithm for optimal variable weighting in K-means clustering
    Zhang, Shaonan
    Li, Shanshan
    Hu, Jiaqiao
    Xing, Haipeng
    Zhu, Wei
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2019, 48 (05) : 1346 - 1365
  • [38] Improved K-means clustering algorithm based on user tag
    Tang J.
    Journal of Convergence Information Technology, 2010, 5 (10) : 124 - 130
  • [39] An Optimal Distributed K-Means Clustering Algorithm Based on CloudStack
    Mao, Yingchi
    Xu, Ziyang
    Ping, Ping
    Wang, Longbao
    2015 NINTH INTERNATIONAL CONFERENCE ON FRONTIER OF COMPUTER SCIENCE AND TECHNOLOGY FCST 2015, 2015, : 386 - 391
  • [40] An Improved K-Means Clustering Algorithm Based on Semantic Model
    Liu, Zhe
    Bao, Jianmin
    Ding, Fei
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING 2018 (ICITEE '18), 2018,