ClusterMPP: An unsupervised density-based clustering algorithm via Marked Point Process

Cited by: 2
Authors
Henni, Khadidja [1]
Alata, Olivier [2]
Zaoui, Lynda [1]
Vannier, Brigitte [3]
El Idrissi, Abdellatif [4]
Moussa, Ahmed [5]
Affiliations
[1] Univ Sci & Technol, Dept Comp Sci, LSSD Lab, Oran, Algeria
[2] Jean Monnet Univ, CNRS, UMR 5516, Hubert Curien Lab, St Etienne, France
[3] Poitiers Univ, Receptors Regulat & Tumor Cells Lab, Poitiers, France
[4] Abdelmalek Essaadi Univ, ENSA Tangier, Tangier, Morocco
[5] Abdelmalek Essaadi Univ, ENSA Tangier, Syst & Data Engn Team, Tangier, Morocco
Keywords
Unsupervised learning; density-based clustering; mode detection; Marked Point Process; non-parametric; multidimensional data; overlapping clusters; BIG DATA; EXTRACTION; MODEL;
DOI
10.3233/IDA-160010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Conventional clustering algorithms optimize a single criterion, which may not conform to the diverse needs of multidimensional data science. This paper proposes a new clustering algorithm, clustering by Marked Point Process (ClusterMPP), that addresses multiple clustering issues. It is an efficient, scalable, unsupervised density-based clustering algorithm. ClusterMPP simulates a proposed Marked Point Process (MPP) to find clusters of complex shapes present in the raw data space. In the first step, the algorithm outputs the observations belonging to each cluster mode, called "prototypes". In the second step, classification is performed using an improved KNN algorithm. We conduct intensive experiments to compare ClusterMPP with the most well-known algorithms. The results demonstrate the efficiency of ClusterMPP on highly complex and scalable datasets.
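The two-step structure described in the abstract (find mode "prototypes", then classify with KNN) can be illustrated with a minimal sketch. This is not the authors' method: the Marked Point Process simulation is replaced by a plain neighbour-count density threshold, and the "improved KNN" by ordinary majority-vote k-NN. The function names, the `radius`, `density_quantile`, and `k` parameters, and the synthetic two-blob data are all assumptions made for illustration only.

```python
import numpy as np


def pairwise_distances(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def find_prototypes(x, radius, density_quantile=0.5):
    """Step 1 stand-in: keep dense points and group them into modes.

    NOT the paper's Marked Point Process simulation; a plain
    neighbour-count density threshold is used in its place.
    """
    dist = pairwise_distances(x, x)
    density = (dist <= radius).sum(axis=1)  # neighbours within radius (self included)
    is_proto = density >= np.quantile(density, density_quantile)
    proto_idx = np.flatnonzero(is_proto)

    # Connected components among prototypes: two prototypes share a mode
    # when a chain of prototypes, each within `radius` of the next, links them.
    adjacent = dist[np.ix_(proto_idx, proto_idx)] <= radius
    mode = np.full(len(proto_idx), -1)
    n_modes = 0
    for start in range(len(proto_idx)):
        if mode[start] != -1:
            continue
        stack = [start]
        mode[start] = n_modes
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(adjacent[i] & (mode == -1)):
                mode[j] = n_modes
                stack.append(j)
        n_modes += 1
    return proto_idx, mode


def classify_with_knn(x, proto_idx, proto_mode, k=5):
    """Step 2 stand-in: plain majority-vote k-NN over the prototypes."""
    k = min(k, len(proto_idx))
    dist = pairwise_distances(x, x[proto_idx])
    nearest = np.argsort(dist, axis=1)[:, :k]
    votes = proto_mode[nearest]
    labels = np.array([np.bincount(v).argmax() for v in votes])
    labels[proto_idx] = proto_mode  # prototypes keep their own mode label
    return labels


# Hypothetical data: two well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(5.0, 0.3, (100, 2))])
proto_idx, proto_mode = find_prototypes(x, radius=0.5)
labels = classify_with_knn(x, proto_idx, proto_mode)
```

Separating prototype detection from label assignment means the expensive density step only has to identify the cluster cores; every other observation is then labelled by a cheap nearest-prototype vote.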
Pages: 827 - 847
Page count: 21