Privacy-preserving Density-based Clustering

Cited by: 10
Authors
Bozdemir, Beyza [1]
Canard, Sebastien [2]
Ermis, Orhan [1]
Moellering, Helen [3]
Onen, Melek [1]
Schneider, Thomas [3]
Affiliations
[1] EURECOM, Sophia Antipolis, France
[2] Orange Labs, Appl Crypto Grp, Caen, France
[3] Tech Univ Darmstadt, Darmstadt, Germany
Funding
European Research Council; EU Horizon 2020
Keywords
Private Machine Learning; Clustering; Secure Computation
DOI
10.1145/3433210.3453104
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Clustering is an unsupervised machine learning technique that outputs clusters containing similar data items. In this work, we investigate privacy-preserving density-based clustering, which is used, for example, in financial analytics and medical diagnosis. When multiple data owners collaborate or outsource the computation, privacy concerns arise. To address this problem, we design, implement, and evaluate the first practical and fully private density-based clustering scheme based on secure two-party computation. Our protocol privately executes the DBSCAN algorithm without disclosing any information (including the number and size of clusters). It can be used for private clustering between two parties as well as for private outsourcing by an arbitrary number of data owners to two non-colluding servers. Our implementation of the DBSCAN algorithm privately clusters data sets with 400 elements in 7 minutes on commodity hardware. It flexibly determines the number of required clusters and is insensitive to outliers, while being only a factor of 19 slower than today's fastest private K-means protocol (Mohassel et al., PETS'20), which can only be used for specific data sets. We then show how to transfer our newly designed protocol to related clustering algorithms by introducing a private approximation of the TRACLUS algorithm for trajectory clustering, which has interesting real-world applications such as financial time series forecasts and the investigation of the spread of a disease like COVID-19.
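For orientation, here is a minimal plaintext (non-private) sketch of the standard DBSCAN algorithm that the abstract refers to. It is not the authors' secure two-party protocol and involves no cryptography. The function name `dbscan`, the parameter names `eps` and `min_pts`, and the sample points are illustrative assumptions, not taken from the paper. The sketch shows the two properties the abstract highlights: the number of clusters is not fixed in advance, and isolated points are labeled as noise.

```python
from math import dist

NOISE = -1        # label for outliers
UNVISITED = None  # label for points not yet processed


def dbscan(points, eps, min_pts):
    """Plaintext DBSCAN sketch; returns one label per point (-1 means noise)."""
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within distance eps of point i (including i).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            # Not a core point; may later be relabeled as a border point.
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                # j is itself a core point, so the cluster keeps growing.
                queue.extend(reachable)
        cluster += 1
    return labels


# Hypothetical sample data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

In a private setting such as the one the abstract describes, the data-dependent branches and the variable-length queue in this sketch would reveal information about the input, which is one reason a dedicated secure protocol is needed rather than a direct translation of this code.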
Pages: 658-671
Page count: 14