Privacy-preserving Density-based Clustering

Cited by: 10
Authors
Bozdemir, Beyza [1]
Canard, Sebastien [2]
Ermis, Orhan [1]
Moellering, Helen [3]
Onen, Melek [1]
Schneider, Thomas [3]
Affiliations
[1] EURECOM, Sophia Antipolis, France
[2] Orange Labs, Appl Crypto Grp, Caen, France
[3] Tech Univ Darmstadt, Darmstadt, Germany
Funding
European Research Council; EU Horizon 2020
Keywords
Private Machine Learning; Clustering; Secure Computation
DOI
10.1145/3433210.3453104
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering is an unsupervised machine learning technique that outputs clusters containing similar data items. In this work, we investigate privacy-preserving density-based clustering, which is used, for example, in financial analytics and medical diagnosis. When (multiple) data owners collaborate or outsource the computation, privacy concerns arise. To address this problem, we design, implement, and evaluate the first practical and fully private density-based clustering scheme based on secure two-party computation. Our protocol privately executes the DBSCAN algorithm without disclosing any information (including the number and size of clusters). It can be used for private clustering between two parties as well as for private outsourcing by an arbitrary number of data owners to two non-colluding servers. Our implementation of the DBSCAN algorithm privately clusters data sets with 400 elements in 7 minutes on commodity hardware. It flexibly determines the number of required clusters and is insensitive to outliers, while being only 19x slower than today's fastest private K-means protocol (Mohassel et al., PETS'20), which can only be used for specific data sets. We then show how to transfer our newly designed protocol to related clustering algorithms by introducing a private approximation of the TRACLUS algorithm for trajectory clustering, which has interesting real-world applications such as financial time series forecasts and the investigation of the spread of a disease like COVID-19.
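For orientation, the following is a minimal plaintext sketch of the standard DBSCAN algorithm that the abstract refers to. It is not the authors' protocol: it runs in the clear with data-dependent control flow, whereas the paper executes the clustering under secure two-party computation. The function names (`region_query`, `dbscan`) and the parameters `eps` and `min_pts` are illustrative choices, not taken from the paper.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1      # label for outliers
UNVISITED = 0   # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plaintext DBSCAN (illustrative only): returns one label per point.

    Labels 1, 2, ... are cluster ids; NOISE marks outliers.
    The number of clusters is discovered, not given as input.
    """
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


# Hypothetical toy data: two dense groups and one outlier.
demo_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
demo_labels = dbscan(demo_points, eps=1.5, min_pts=3)
```

On the toy data the sketch finds two clusters and marks the isolated point as noise, which illustrates the two properties the abstract highlights: the number of clusters is determined by the algorithm, and outliers do not distort the clusters.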
Pages: 658-671
Page count: 14