A Proposal for Local k Values for k-Nearest Neighbor Rule

被引:79
作者
Garcia-Pedrajas, Nicolas [1 ]
Romero del Castillo, Juan A. [1 ]
Cerruela-Garcia, Gonzalo [1 ]
机构
[1] Univ Cordoba, Dept Comp & Numer Anal, E-14004 Cordoba, Spain
关键词
Classification; class-imbalanced data sets; k-nearest neighbors (k-NN); RECOGNITION;
D O I
10.1109/TNNLS.2015.2506821
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The k-nearest neighbor (k-NN) classifier is one of the most widely used methods of classification due to several interesting features, including good generalization and easy implementation. Although simple, it is usually able to match and even outperform more sophisticated and complex methods. One of the problems with this approach is fixing the appropriate value of k. Although a good value might be obtained using cross validation, it is unlikely that the same value could be optimal for the whole space spanned by the training set. It is evident that different regions of the feature space would require different values of k due to the different distributions of prototypes. The situation of a query instance in the center of a class is very different from the situation of a query instance near the boundary between two classes. In this brief, we present a simple yet powerful approach to setting a local value of k. We associate a potentially different k to every prototype and obtain the best value of k by optimizing a criterion consisting of the local and global effects of the different k values in the neighborhood of the prototype. The proposed method has a fast training stage and the same complexity as the standard k-NN approach at the testing stage. The experiments show that this simple approach can significantly outperform the standard k-NN rule for both standard and class-imbalanced problems in a large set of different problems.
引用
收藏
页码:470 / 475
页数:6
相关论文
共 13 条
  • [1] [Anonymous], P ADV NEUR INF PROC
  • [2] Shape matching and object recognition using shape contexts
    Belongie, S
    Malik, J
    Puzicha, J
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (04) : 509 - 522
  • [3] A lot of randomness is hiding in accuracy
    Ben-David, Arle
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2007, 20 (07) : 875 - 885
  • [4] Demsar J, 2006, J MACH LEARN RES, V7, P1
  • [5] Ferrer-Troyano F. J., 2001, P 10 PORT C ART INT, V2258, P22
  • [6] A novel two-level nearest neighbor classification algorithm using an adaptive distance metric
    Gao, Yunlong
    Pan, Jinyan
    Ji, Guoli
    Yang, Zijiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2012, 26 : 103 - 110
  • [7] LICHMAN M., 2013, UCI MACHINE LEARNING
  • [8] A simple locally adaptive nearest neighbor rule with application to pollution forecasting
    Nock, R
    Sebban, M
    Bernard, D
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2003, 17 (08) : 1369 - 1382
  • [9] Shakhnarovich G, 2003, NINTH IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, VOLS I AND II, PROCEEDINGS, P750
  • [10] Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence
    Wang, JG
    Neskovic, P
    Cooper, LN
    [J]. PATTERN RECOGNITION, 2006, 39 (03) : 417 - 423