Nearest neighbour group-based classification

Cited by: 22
Authors
Samsudin, Noor A. [1]
Bradley, Andrew P. [1]
Affiliation
[1] Univ Queensland, Sch Informat Technol & Elect Engn, St Lucia, Qld 4072, Australia
Keywords
Group-based classification; Nearest neighbour; Compound classification; Pattern recognition
DOI
10.1016/j.patcog.2010.05.010
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The purpose of group-based classification (GBC) is to determine the class label for a set of test samples, utilising the prior knowledge that the samples belong to the same, but unknown, class. This can be seen as a simplification of the well-studied, but computationally complex, non-sequential compound classification problem. In this paper, we extend three variants of the nearest neighbour algorithm to develop a number of non-parametric group-based classification techniques. The proposed techniques are then evaluated on both synthetic and real-world data sets, and their performance is compared with that of techniques that label test samples individually. The results show that, while no one algorithm clearly outperforms all others on all data sets, the proposed group-based classification techniques have the potential to outperform the individual-based techniques, especially as the (group) size of the test set increases. In addition, it is shown that algorithms that pool information from the whole test set perform better than two-stage approaches that undertake a vote based on the class labels of individual test samples. © 2010 Elsevier Ltd. All rights reserved.
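The contrast the abstract draws between two-stage voting and pooling can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the toy data, and the specific pooling rule (summing each class's nearest-neighbour distance over the whole group) are assumptions chosen for illustration, and may not match any of the three nearest neighbour variants the authors extend.

```python
import math
from collections import Counter


def nearest_distance(x, points):
    """Euclidean distance from x to its nearest neighbour in points."""
    return min(math.dist(x, p) for p in points)


def vote_label(group, train):
    """Two-stage approach: label each test sample by 1-NN, then majority vote."""
    votes = Counter(
        min(train, key=lambda c: nearest_distance(x, train[c])) for x in group
    )
    return votes.most_common(1)[0][0]


def pooled_label(group, train):
    """Pooled approach (assumed rule): pick the class with the smallest
    summed nearest-neighbour distance over the whole test group."""
    totals = {
        c: sum(nearest_distance(x, pts) for x in group)
        for c, pts in train.items()
    }
    return min(totals, key=totals.get)


# Hypothetical toy data: one training point per class, three test samples
# known to share a single (unknown) class.
train = {"a": [(0.0, 0.0)], "b": [(4.0, 0.0)]}
group = [(1.9, 0.0), (1.9, 0.0), (10.0, 0.0)]

print(vote_label(group, train))    # a  (two of three individual 1-NN labels)
print(pooled_label(group, train))  # b  (summed distances: a = 13.8, b = 10.2)
```

On this toy group the two rules disagree, which shows only that pooling distances and voting on individual labels are different decision rules; it says nothing about which is more accurate in general.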
Pages: 3458-3467
Page count: 10