InstanceRank based on borders for instance selection

Cited by: 24
Authors
Hernandez-Leal, Pablo [1]
Ariel Carrasco-Ochoa, J. [1]
Fco Martinez-Trinidad, J. [1]
Arturo Olvera-Lopez, J. [2]
Affiliations
[1] Natl Inst Astrophys Opt & Elect, Dept Comp Sci, Puebla 72840, Mexico
[2] Benemerita Univ Autonoma Puebla, Dept Comp Sci, Puebla 72570, Mexico
Keywords
Instance selection; Instance ranking; Border instances; Supervised classification; NEAREST-NEIGHBOR; ALGORITHM
DOI
10.1016/j.patcog.2012.07.007
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Instance selection algorithms reduce the number of training instances. However, most of them suffer from long runtimes, which makes them impractical for large datasets. In this work, we introduce an instance ranking per class based on borders (instances that lie near instances of a different class); using this ranking, we propose an instance selection algorithm (IRB). We evaluated the proposed algorithm using k-NN on small and large datasets, comparing it against state-of-the-art instance selection algorithms. In our experiments, IRB offers the best compromise between time and accuracy for large datasets. We also tested our algorithm with the SVM, LWLR and C4.5 classifiers; in all cases, the selection computed by our algorithm obtained the best accuracies on average. (C) 2012 Elsevier Ltd. All rights reserved.
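The abstract does not give the ranking formula, so the following is only an illustrative sketch of the general idea of ranking instances per class by how close they lie to a class border, and is not the IRB algorithm itself. It assumes that "border-likeness" can be approximated by the Euclidean distance to the nearest instance of a different class; the function names (`nearest_enemy_distance`, `rank_per_class`, `select_instances`) and the `keep_fraction` parameter are hypothetical. The brute-force search is quadratic in the number of instances and is meant for illustration on tiny data only.

```python
import math
from collections import defaultdict


def nearest_enemy_distance(i, X, y):
    """Distance from instance i to its closest instance of another class.

    Assumption: this distance stands in for "how close to a border" an
    instance is. Raises ValueError if the data contains a single class.
    """
    return min(math.dist(X[i], X[j]) for j in range(len(X)) if y[j] != y[i])


def rank_per_class(X, y):
    """Return {label: [indices]} ordered from most to least border-like."""
    scored = defaultdict(list)
    for i in range(len(X)):
        scored[y[i]].append((nearest_enemy_distance(i, X, y), i))
    # Smaller distance to another class means closer to the border.
    return {label: [i for _, i in sorted(pairs)] for label, pairs in scored.items()}


def select_instances(X, y, keep_fraction=0.5):
    """Keep the top-ranked fraction of each class (hypothetical selection rule)."""
    selected = []
    for ranked in rank_per_class(X, y).values():
        n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
        selected.extend(ranked[:n_keep])
    return sorted(selected)


# Two classes on a line; the border lies between x = 2 and x = 4.
X = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
y = ["a", "a", "a", "b", "b", "b"]
selected = select_instances(X, y, keep_fraction=0.5)
print(selected)
```

Ranking within each class, rather than over the whole dataset, keeps every class represented in the selected subset even when one class sits much farther from the border than the others.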
Pages: 365-375
Number of pages: 11