A novel kNN algorithm with data-driven k parameter computation

Cited by: 136

Authors:

Zhang, Shichao [1]

Cheng, Debo [1,2]

Deng, Zhenyun [1]

Zong, Ming [3]

Deng, Xuelian [4]

Affiliations:

[1] Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, Guilin, Guangxi, Peoples R China

[2] Univ South Australia, Informat Technol & Math Sci, Adelaide, SA, Australia

[3] Massey Univ, Inst Nat & Math Sci, Auckland, New Zealand

[4] Guangxi Univ Chinese Med, Coll Publ Hlth & Management, Nanning, Guangxi, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2018 / Vol. 109

Keywords:

kNN method; kNN prediction; Parameter computation; REGRESSION; IMPUTATION; CLASSIFICATION; OPTIMIZATION; SELECTION;

DOI:

10.1016/j.patrec.2017.09.036

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper studies an example-driven k-parameter computation that identifies different k values for different test samples in kNN prediction applications such as classification, regression, and missing data imputation. This is done by reconstructing a sparse coefficient matrix between the test samples and the training data. In the reconstruction process, an ℓ1-norm regularization is employed to generate an element-wise sparse coefficient matrix, and an LPP (Locality Preserving Projection) regularization is adopted to preserve the local structure of the data and improve efficiency. The kNN approach is then applied with the learnt k values to classification, regression, and missing data imputation. We experimentally evaluate the proposed approach on 20 real datasets and show that our algorithm is much better than previous kNN algorithms on data mining tasks such as classification, regression, and missing value imputation. © 2017 Elsevier B.V. All rights reserved.
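The idea in the abstract, a per-sample k read off a sparse reconstruction, can be illustrated with a minimal sketch. This is not the authors' implementation: it drops the LPP regularization term entirely, solves only an ℓ1-regularized least-squares problem with a plain ISTA loop, and assumes that k is the number of nonzero coefficients, which is then passed to an ordinary Euclidean majority-vote kNN. The function names, the `lam` value, and the toy data are all hypothetical.

```python
from collections import Counter

import numpy as np


def sparse_coefficients(X_train, x_test, lam=0.1, n_iter=500):
    """ISTA for min_w 0.5 * ||X_train.T @ w - x_test||^2 + lam * ||w||_1.

    Assumption: the LPP term described in the abstract is omitted here.
    """
    A = X_train.T                                # columns are training samples
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)     # 1 / Lipschitz constant of the gradient
    w = np.zeros(X_train.shape[0])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - x_test)            # gradient of the squared-error term
        z = w - step * grad
        # Soft-thresholding: proximal step for the l1 penalty
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


def predict_with_learned_k(X_train, y_train, x_test, lam=0.1):
    """Pick k as the number of nonzero coefficients, then do a majority vote."""
    w = sparse_coefficients(X_train, x_test, lam)
    k = max(1, int(np.count_nonzero(w)))         # always use at least one neighbour
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(dists)[:k]
    label = Counter(y_train[nearest].tolist()).most_common(1)[0][0]
    return k, label


# Toy data (hypothetical): two small clusters with labels 0 and 1
X = np.array([[1.0, 0.0], [0.9, 0.1], [1.1, -0.1],
              [0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]])
y = np.array([0, 0, 0, 1, 1, 1])
k, label = predict_with_learned_k(X, y, np.array([1.0, 0.05]))
print(f"learned k = {k}, predicted label = {label}")
```

A larger `lam` drives more coefficients to zero and so tends to give a smaller k. For regression or imputation, the majority vote would be replaced by an average of the neighbours' target values.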


Pages: 44-54

Number of pages: 11
