A weighted kernel possibilistic c-means algorithm based on cloud computing for clustering big data

Cited by: 42
Authors
Zhang, Qingchen [1]
Chen, Zhikui [1]
Affiliations
[1] Dalian Univ Technol, Sch Software Technol, Liaoning, Peoples R China
Keywords
big data; complex networks; cloud computing; PCM; NETWORK; COMMUNICATION
DOI
10.1002/dac.2844
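Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract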
The possibilistic c-means (PCM) clustering algorithm has emerged as an important data preprocessing tool that is widely used in data mining and knowledge discovery. Owing to the huge amount of data, high computational complexity, and noise-corrupted data, PCM algorithms scaled for big data find it difficult to produce a good result in real time. The paper proposes a weighted kernel PCM (wkPCM) algorithm to cluster data objects into appropriate groups. The proposed algorithm introduces weights to define the relative importance of each object in the kernel clustering solution, which reduces the corruption caused by noisy data. To improve the real-time performance of the proposed algorithm, cloud computing technology is used to optimize wkPCM, yielding a distributed wkPCM algorithm based on MapReduce that can provide a significant computational speedup. Experiments demonstrate that the proposed possibilistic clustering algorithms can cluster big data into appropriate groups in real time. Copyright (c) 2014 John Wiley & Sons, Ltd.
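The abstract gives no formulas, so the following is only a rough illustration of the idea it describes, not the authors' wkPCM. It assumes a Gaussian kernel, the textbook PCM typicality update with the kernel-induced distance d^2 = 2(1 - K(x, v)), and a per-object weight w that scales each object's contribution to the prototype update. The map and reduce steps are plain Python functions standing in for a MapReduce job. All function names and the parameters `sigma`, `m`, and `etas` are hypothetical.

```python
import math
from functools import reduce


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def typicality(x, v, eta, m, sigma):
    """Standard PCM typicality using the kernel-induced squared distance."""
    d2 = 2.0 * (1.0 - gaussian_kernel(x, v, sigma))
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


def map_chunk(chunk, prototypes, etas, m, sigma):
    """Map step: partial (numerator vector, denominator) per cluster for one chunk.

    chunk is a list of (x, w) pairs, where w is the assumed object weight.
    """
    dim = len(prototypes[0])
    partial = [([0.0] * dim, 0.0) for _ in prototypes]
    for x, w in chunk:
        for i, v in enumerate(prototypes):
            u = typicality(x, v, etas[i], m, sigma)
            coeff = w * (u ** m) * gaussian_kernel(x, v, sigma)
            num, den = partial[i]
            partial[i] = ([n + coeff * a for n, a in zip(num, x)], den + coeff)
    return partial


def reduce_partials(p, q):
    """Reduce step: add the partial sums of two chunks cluster by cluster."""
    return [([a + b for a, b in zip(n1, n2)], d1 + d2)
            for (n1, d1), (n2, d2) in zip(p, q)]


def update_prototypes(chunks, prototypes, etas, m=2.0, sigma=1.0):
    """One prototype update over all chunks; keeps a prototype if its sum is zero."""
    total = reduce(reduce_partials,
                   (map_chunk(c, prototypes, etas, m, sigma) for c in chunks))
    return [[n / den for n in num] if den > 0 else list(v)
            for (num, den), v in zip(total, prototypes)]
```

Because the update is a ratio of sums over objects, the partial sums from each chunk can be computed independently and then added, which is what makes this kind of update a natural fit for MapReduce. In this sketch, giving an object a weight of zero removes its influence on every prototype.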
Pages: 1378-1391
Page count: 14