On optimum choice of k in nearest neighbor classification

Cited by: 87

Authors:

Ghosh, Anil K. [1]

Affiliations:

[1] Indian Stat Inst, Theoret Stat & Math Unit, Kolkata 700108, India

Source:

COMPUTATIONAL STATISTICS & DATA ANALYSIS | 2006 / Vol. 50 / No. 11

Keywords:

accuracy index; Bayesian strength function; cross-validation; misclassification rate; neighborhood parameter; non-informative prior; optimal Bayes risk; posterior probability

DOI:

10.1016/j.csda.2005.06.007

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

A major issue in k-nearest neighbor classification is how to choose the optimum value of the neighborhood parameter k. Popular cross-validation techniques often fail to guide us well in selecting k mainly due to the presence of multiple minimizers of the estimated misclassification rate. This article investigates a Bayesian method in this connection, which solves the problem of multiple optimizers. The utility of the proposed method is illustrated using some benchmark data sets. (C) 2005 Elsevier B.V. All rights reserved.
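The multiple-minimizer problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the Bayesian method proposed in the article; it only shows, under assumptions, how a leave-one-out cross-validation estimate of the misclassification rate can be minimized by several values of k at once. The function names `loo_error_rates` and `all_minimizers`, the toy data, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def loo_error_rates(points, labels, k_values):
    """Leave-one-out misclassification rate of k-NN for each k in k_values."""
    n = len(points)
    # For every point, list the labels of all other points, nearest first.
    # Distance ties are broken by index (an arbitrary, illustrative choice).
    neighbor_labels = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (
                sum((a - b) ** 2 for a, b in zip(points[i], points[j])),
                j,
            ),
        )
        neighbor_labels.append([labels[j] for j in others])

    rates = {}
    for k in k_values:
        errors = 0
        for i in range(n):
            votes = Counter(neighbor_labels[i][:k])
            top = max(votes.values())
            # Majority vote; vote ties go to the smallest label (assumption).
            predicted = min(lab for lab, count in votes.items() if count == top)
            if predicted != labels[i]:
                errors += 1
        rates[k] = errors / n
    return rates


def all_minimizers(rates):
    """Return every k that attains the smallest estimated error rate."""
    best = min(rates.values())
    return [k for k, rate in sorted(rates.items()) if rate == best]


# Toy data (hypothetical): two well-separated clusters of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

rates = loo_error_rates(points, labels, k_values=[1, 3, 5, 7])
print(rates)                  # {1: 0.0, 3: 0.0, 5: 0.0, 7: 1.0}
print(all_minimizers(rates))  # [1, 3, 5]
```

On this toy data, k = 1, 3, and 5 all give the same estimated error, so cross-validation alone does not single out one value of k.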

Pages: 3113-3123

Page count: 11
