A graph-based semi-supervised k nearest-neighbor method for nonlinear manifold distributed data classification

Cited by: 29
Authors
Tu, Enmei [1 ]
Zhang, Yaqian [2 ]
Zhu, Lin [3 ]
Yang, Jie [4 ]
Kasabov, Nikola [5 ]
Affiliations
[1] Nanyang Technol Univ, Rolls Royce NTU Corp Lab, Singapore, Singapore
[2] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore, Singapore
[3] Shanghai Univ Elect Power, Sch Comp Sci & Technol, Shanghai, Peoples R China
[4] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200030, Peoples R China
[5] Auckland Univ Technol, Knowledge Engn & Discovery Res Inst, Auckland, New Zealand
Keywords
k nearest neighbors; Manifold classification; Constrained tired random walk; Semi-supervised learning; Dimensionality reduction; Algorithms
DOI
10.1016/j.ins.2016.07.016
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
k nearest neighbors (kNN) is one of the most widely used supervised learning algorithms for classifying Gaussian distributed data, but it does not achieve good results when applied to nonlinear manifold distributed data, especially when only a very limited number of labeled samples is available. In this paper, we propose a new graph-based kNN algorithm which can effectively handle both Gaussian distributed data and nonlinear manifold distributed data. To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by constructing an R-level nearest-neighbor strengthened tree over the graph, and then compute a TRW matrix for similarity measurement purposes. After this, the nearest neighbors are identified according to the TRW matrix, and the class label of a query point is determined by the sum of the TRW weights of its nearest neighbors. To deal with online situations, we also propose a new algorithm that handles sequential samples based on local neighborhood reconstruction. Comparison experiments are conducted on both synthetic and real-world data sets to demonstrate the validity of the proposed new kNN algorithm and its improvements over other versions of the kNN algorithm. Given the widespread appearance of manifold structures in real-world problems and the popularity of the traditional kNN algorithm, the proposed manifold version of kNN shows promising potential for classifying manifold-distributed data. (C) 2016 Elsevier Inc. All rights reserved.
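The abstract describes the classification rule but not an implementation. Below is a minimal Python sketch of a graph-based kNN classifier built on a tired-random-walk similarity matrix, assuming a Gaussian affinity graph and taking the TRW matrix to be the geometric series of the scaled transition matrix, sum over t of (alpha*P)^t = (I - alpha*P)^{-1}. The constrained R-level nearest-neighbor strengthened tree and the online local-neighborhood-reconstruction variant from the paper are not reproduced here, and all parameter names (sigma, alpha, k) are illustrative assumptions rather than the authors' exact recipe.

```python
import numpy as np
from scipy.spatial.distance import cdist

def trw_knn_classify(X_lab, y_lab, X_query, k=5, sigma=1.0, alpha=0.9):
    """Sketch of a TRW-based kNN classifier (assumed formulation, not the
    paper's exact algorithm): build a Gaussian affinity graph over labeled
    and query points, accumulate random walks of all lengths with decay
    factor alpha, and label each query by the class with the largest sum of
    TRW weights among its k most similar labeled neighbors."""
    X = np.vstack([X_lab, X_query])
    n_lab = X_lab.shape[0]

    # Gaussian affinity graph over all points (labeled + query).
    W = np.exp(-cdist(X, X) ** 2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Row-normalized transition matrix of the random walk.
    P = W / W.sum(axis=1, keepdims=True)

    # Tired random walk matrix: sum_t (alpha * P)^t = (I - alpha * P)^{-1},
    # which converges because alpha < 1 bounds the spectral radius below 1.
    T = np.linalg.inv(np.eye(X.shape[0]) - alpha * P)

    y_pred = np.empty(X_query.shape[0], dtype=y_lab.dtype)
    for i in range(X_query.shape[0]):
        sims = T[n_lab + i, :n_lab]       # TRW similarity to labeled points
        nn = np.argsort(sims)[-k:]        # k most similar labeled neighbors
        classes = np.unique(y_lab[nn])
        # Class score = sum of TRW weights of the neighbors in that class.
        scores = [sims[nn][y_lab[nn] == c].sum() for c in classes]
        y_pred[i] = classes[int(np.argmax(scores))]
    return y_pred

# Example usage on toy two-moons-like data (labels as integers):
# y_hat = trw_knn_classify(X_lab, np.array(y_lab), X_query, k=5)
```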
Pages: 673-688
Number of pages: 16