Large-scale k-means clustering with user-centric privacy-preservation

Cited by: 0
Authors
Jun Sakuma
Shigenobu Kobayashi
Affiliations
[1] University of Tsukuba, Department of Computer Science
[2] Tokyo Institute of Technology, Department of Computational Intelligence and Systems Science
Source
Knowledge and Information Systems | 2010 / Vol. 25
Keywords
Privacy; Privacy-preserving data mining; Clustering; k-means; Peer-to-peer
DOI
Not available
Abstract
A k-means clustering algorithm built on a new privacy-preserving concept, user-centric privacy preservation, is presented. In this framework, users can conduct data mining on their private information while keeping it in their own local storage. After the computation, they obtain only the mining result, without disclosing private information to others. In most cases, the number of parties that can join conventional privacy-preserving data mining has been assumed to be only two. In our framework, we assume that large numbers of parties join the protocol; therefore, not only scalability but also asynchronism and fault tolerance are important. Considering this, we propose a k-means algorithm combined with a decentralized cryptographic protocol and a gossip-based protocol. The computational complexity is O(log n) with respect to the number of parties n, and experimental results show that our protocol is scalable even with one million parties.
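To give a feel for why a gossip-based protocol scales to many parties, the following is a minimal sketch of plain push-pull gossip averaging. It is not the authors' protocol: it omits the cryptographic layer entirely, operates on unencrypted values, and the function name `gossip_average` and its parameters are hypothetical names chosen for illustration. Each party repeatedly averages its value with that of a randomly chosen peer, so every party's value approaches the global mean, which is the kind of aggregate a k-means centroid update needs.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Illustrative push-pull gossip averaging (assumption: no encryption,
    no failures, fully connected network). In each round, every party
    contacts one random peer and both adopt the mean of their two values."""
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    if n < 2:
        return x
    for _ in range(rounds):
        for i in range(n):
            # Pick a peer j != i uniformly at random.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (x[i] + x[j]) / 2.0
            # The pairwise exchange conserves the sum, hence the global mean.
            x[i] = x[j] = mean
    return x


if __name__ == "__main__":
    data = [i / 1000.0 for i in range(1000)]
    estimates = gossip_average(data, rounds=60)
    print("true mean:", sum(data) / len(data))
    print("spread of local estimates:", max(estimates) - min(estimates))
```

Because each pairwise exchange preserves the sum of all values, the local estimates can only move toward the true mean, and the spread between parties shrinks quickly with the number of rounds.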
Pages: 253-279
Page count: 26