k-Nearest Neighbour Classifiers - A Tutorial

被引:445
作者
Cunningham, Padraig [1 ]
Delany, Sarah Jane [2 ]
机构
[1] Univ Coll Dublin, Sch Comp Sci, Dublin, Ireland
[2] Technol Univ Dublin, Sch Comp Sci, Dublin, Ireland
基金
爱尔兰科学基金会;
关键词
k-Nearest neighbour classifiers; BINARY SEARCH TREES; ALGORITHM; DISTANCE; INFORMATION; REDUCTION; SELECTION;
D O I
10.1145/3459665
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Perhaps the most straightforward classifier in the arsenal or Machine Learning techniques is the Nearest Neighbour Classifier-classification is achieved by identifying the nearest neighbours to a query example and using those neighbours to determine the class of the query. This approach to classification is of particular importance, because issues of poor runtime performance is not such a problem these days with the computational power that is available. This article presents an overview of techniques for Nearest Neighbour classification focusing on: mechanisms for assessing similarity (distance), computational issues in identifying nearest neighbours, and mechanisms for reducing the dimension of the data. This article is the second edition of a paper previously published as a technical report [16]. Sections on similarity measures for time-series, retrieval speedup, and intrinsic dimensionality have been added. An Appendix is included, providing access to Python code for the key methods.
引用
收藏
页数:25
相关论文
共 51 条
[1]   INSTANCE-BASED LEARNING ALGORITHMS [J].
AHA, DW ;
KIBLER, D ;
ALBERT, MK .
MACHINE LEARNING, 1991, 6 (01) :37-66
[2]   AN INTRODUCTION TO KERNEL AND NEAREST-NEIGHBOR NONPARAMETRIC REGRESSION [J].
ALTMAN, NS .
AMERICAN STATISTICIAN, 1992, 46 (03) :175-185
[3]   Fast nearest neighbor condensation for large data sets classification [J].
Angiulli, Fabrizio .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (11) :1450-1464
[4]  
[Anonymous], 2003, PROC ACM SIGMOD WORK, DOI DOI 10.1145/882082.882086
[5]  
[Anonymous], P 10 INT C MACHINE L
[6]  
Baek K, 2002, PROCEEDINGS OF THE 6TH JOINT CONFERENCE ON INFORMATION SCIENCES, P824
[7]   The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances [J].
Bagnall, Anthony ;
Lines, Jason ;
Bostrom, Aaron ;
Large, James ;
Keogh, Eamonn .
DATA MINING AND KNOWLEDGE DISCOVERY, 2017, 31 (03) :606-660
[8]   MULTIDIMENSIONAL BINARY SEARCH TREES USED FOR ASSOCIATIVE SEARCHING [J].
BENTLEY, JL .
COMMUNICATIONS OF THE ACM, 1975, 18 (09) :509-517
[9]   MULTIDIMENSIONAL BINARY SEARCH TREES IN DATABASE APPLICATIONS [J].
BENTLEY, JL .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1979, 5 (04) :333-340
[10]  
Bergstra J, 2012, J MACH LEARN RES, V13, P281