On Application of a Probabilistic K-Nearest Neighbors Model for Cluster Validation Problem

Cited by: 2
Authors
Volkovich, Zeev [1]
Barzily, Zeev [1]
Avros, Renata [1]
Toledano-Kitai, Dvora [1]
Affiliations
[1] ORT Braude Coll Engn, Software Engn Dept, IL-21982 Karmiel, Israel
Keywords
Clustering; Cluster stability; Data mining; K-Nearest neighbors; Resampling method; Number
DOI
10.1080/03610926.2011.562786
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
K-Nearest Neighbors is a widely used technique for classifying and clustering data. In this article, we address the cluster stability problem using probabilistic characteristics of this approach. We estimate the stability of partitions obtained from clustering pairs of samples. Partitions are presumed to be consistent if their clusters are stable. Cluster validity is quantified through the number of a point's K nearest neighbors that belong to the point's own sample. Under the null hypothesis that the samples are well mixed within the clusters, this quantity follows a Binomial distribution with K trials and success probability 0.5. Each cluster is represented by a summarizing index of the p-values, calculated over all objects in the cluster, for testing the null hypothesis against the alternative, and the quality of a partition is evaluated via its worst cluster. The true number of clusters is indicated by the empirical index distribution with the maximal suitable asymmetry. The proposed methodology produces the index distributions sequentially and assesses their asymmetry. Numerical experiments show that the methodology is well able to expose the true number of clusters.
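The per-point test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the one-sided (upper-tail) alternative, the Euclidean distance, and the use of the mean p-value as the summarizing index are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math

def binom_upper_pvalue(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p).

    Assumption: a one-sided alternative in which too many own-sample
    neighbors signals poorly mixed samples.
    """
    return sum(
        math.comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(successes, trials + 1)
    )

def own_sample_counts(points, labels, k):
    """For each point, count how many of its k nearest neighbors
    (Euclidean distance, the point itself excluded) carry its sample label."""
    counts = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        counts.append(sum(labels[j] == labels[i] for j in neighbors))
    return counts

def cluster_index(points, labels, k):
    """Hypothetical summarizing index: the mean of the per-point p-values."""
    pvals = [binom_upper_pvalue(c, k) for c in own_sample_counts(points, labels, k)]
    return sum(pvals) / len(pvals)

# Toy cluster pooled from two samples (labels 0 and 1) that do not mix at all.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
labels = [0, 0, 0, 1, 1, 1]
print(own_sample_counts(points, labels, k=2))
print(cluster_index(points, labels, k=2))
```

In this toy cluster every point's two nearest neighbors come from its own sample, so each p-value is the smallest attainable for K = 2. A well-mixed cluster would give counts near K/2 and correspondingly larger p-values.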
Pages: 2997-3010
Number of pages: 14
Related papers
50 in total
  • [21] Work in Progress: K-Nearest Neighbors Techniques for ABAC Policies Clustering
    Benkaouz, Yahya
    Erradi, Mohammed
    Freisleben, Bernd
    ABAC'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL WORKSHOP ON ATTRIBUTE BASED ACCESS CONTROL, 2016, : 72 - 75
  • [22] An efficient clustering algorithm based on the k-nearest neighbors with an indexing ratio
    Raneem Qaddoura
    Hossam Faris
    Ibrahim Aljarah
    International Journal of Machine Learning and Cybernetics, 2020, 11 : 675 - 714
  • [23] Solar Forecasting by K-Nearest Neighbors Method with Weather Classification and Physical Model
    Liu, Zhao
    Zhang, Ziang
    2016 NORTH AMERICAN POWER SYMPOSIUM (NAPS), 2016,
  • [24] A Class-Cluster k-Nearest Neighbors Method for Temporal In-Trouble Student Identification
    Chau Vo
    Hua Phung Nguyen
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2019, PT I, 2019, 11431 : 219 - 230
  • [25] K-nearest neighbors and a kernel density estimator for GEFCom2014 probabilistic wind power forecasting
    Zhang, Yao
    Wang, Jianxue
    INTERNATIONAL JOURNAL OF FORECASTING, 2016, 32 (03) : 1074 - 1080
  • [26] On the Use of Weighted k-Nearest Neighbors for Missing Value Imputation
    Lim, Chanhui
    Kim, Dongjae
    KOREAN JOURNAL OF APPLIED STATISTICS, 2015, 28 (01) : 23 - 31
  • [27] A Placement Prediction System Using K-Nearest Neighbors Classifier
    Giri, Animesh
    Bhagavath, M. Vignesh V.
    Pruthvi, Bysani
    Dubey, Naini
    2016 SECOND INTERNATIONAL CONFERENCE ON COGNITIVE COMPUTING AND INFORMATION PROCESSING (CCIP), 2016,
  • [28] An efficient algorithm to find k-nearest neighbors in flocking behavior
    Lee, Jae Moon
    INFORMATION PROCESSING LETTERS, 2010, 110 (14-15) : 576 - 579
  • [29] Evolutionary Optimization on k-Nearest Neighbors Classifier for Imbalanced Datasets
    Shih, Yu-Hsin
    Ting, Chuan-Kang
    2019 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2019, : 3348 - 3355
  • [30] RSSI-based Localization Using K-Nearest Neighbors
    Achroufene, Achour
    AD HOC & SENSOR WIRELESS NETWORKS, 2023, 56 (1-2) : 105 - 135