Implementation and Analysis of Centroid Displacement-Based k-Nearest Neighbors

Cited by: 12
Authors
Wang, Alex X. [1]
Chukova, Stefanka S. [1]
Nguyen, Binh P. [1]
Affiliations
[1] Victoria Univ Wellington, Sch Math & Stat, Wellington, New Zealand
Source
ADVANCED DATA MINING AND APPLICATIONS (ADMA 2022), PT I | 2022 / Vol. 13725
Keywords
Similarity; Nearest neighbors; Centroid displacement; k-NN; Scikit-learn; RECOGNITION;
DOI
10.1007/978-3-031-22064-7_31
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
k-NN is a widely used supervised machine learning method across many domains. Despite its simplicity, effectiveness, and robustness, k-NN has several limitations: its reliance on the Euclidean distance as the similarity metric, the arbitrarily selected neighborhood size k, the computational challenge posed by high-dimensional data, and the use of the simple majority voting rule. Among the variants of k-NN for classification, we sought to address the last issue and proposed the Centroid Displacement-based k-NN (CDNN), in which centroid displacement is used for class determination. In this study, we present an implementation of CDNN for scikit-learn, a well-known machine learning library for the Python programming language, together with a comprehensive comparative performance analysis of CDNN against the k-NN variants available in scikit-learn. We open-source our algorithm for the benefit of users. To the best of our knowledge, no similar study of the performance of k-NN and its variants in scikit-learn has been carried out. We also examine how different distance metrics affect the performance of CDNN on different datasets. Extensive experiments on real-world and synthetic datasets verify the effectiveness of CDNN compared to the standard k-NN and other state-of-the-art k-NN-based algorithms. The results of the distance metric comparison also show that metrics other than the Euclidean distance can further improve the classification performance of CDNN.
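The abstract only states that centroid displacement replaces majority voting for class determination, so the following is a minimal sketch under an assumed reading of that rule, not the authors' scikit-learn implementation. The assumption is that, for each class present among the k nearest neighbors, one measures how far the centroid of that class's neighbors moves when the query point is added, and picks the class whose centroid moves least. The names `cdnn_predict`, `centroid`, and `euclidean` are hypothetical, and the sketch uses only the Python standard library and the Euclidean distance.

```python
import math
from collections import defaultdict


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def cdnn_predict(X_train, y_train, query, k=3):
    """Classify `query` by the smallest centroid displacement (assumed rule).

    For each class among the k nearest neighbors, compute how far that
    class's neighbor centroid moves when `query` is added to it, and
    return the class whose centroid moves least.
    """
    # Indices of the k training points closest to the query.
    order = sorted(range(len(X_train)), key=lambda i: euclidean(X_train[i], query))
    neighbors = order[:k]

    # Group the neighbor points by class label.
    by_class = defaultdict(list)
    for i in neighbors:
        by_class[y_train[i]].append(tuple(X_train[i]))

    # Displacement of each class centroid after inserting the query.
    displacement = {}
    for label, points in by_class.items():
        before = centroid(points)
        after = centroid(points + [tuple(query)])
        displacement[label] = euclidean(before, after)

    return min(displacement, key=displacement.get)


# Two "a" points sit right next to the query; three "b" points are far away.
X_train = [(0.0, 0.0), (0.2, 0.0), (3.0, 0.0), (3.2, 0.0), (3.0, 0.2)]
y_train = ["a", "a", "b", "b", "b"]
print(cdnn_predict(X_train, y_train, (0.1, 0.1), k=5))  # a
```

In this toy example, simple majority voting over all five neighbors would return "b" (three votes to two), whereas the displacement rule returns "a" because adding the query barely moves the centroid of the two nearby "a" points.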
Pages: 431-443
Number of pages: 13