A parameter-free hybrid instance selection algorithm based on local sets with natural neighbors

被引:22
作者
Li, Junnan [1 ]
Zhu, Qingsheng [1 ]
Wu, Quanwang [1 ]
机构
[1] Chongqing Univ, Chongqing Key Lab Software Theory & Technol, Coll Comp Sci, Chongqing 400044, Peoples R China
基金
中国国家自然科学基金;
关键词
Instance selection; K nearest neighbor; Local sets; Natural neighbors; Parameter-free; NEAREST-NEIGHBOR; CLUSTERING-ALGORITHM; REDUCTION ALGORITHM; SIZE;
D O I
10.1007/s10489-019-01598-y
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Instance selection aims to search for the best patterns in the training set and main instance selection methods include condensation methods, edition methods and hybrid methods. Hybrid methods combine advantages of both edition methods and condensation methods. Nevertheless, most of existing hybrid approaches heavily rely on parameters and are relatively time-consuming, resulting in the performance instability and application difficulty. Though several relatively fast and (or) parameter-free hybrid methods are proposed, they still have the difficulty in achieving both high accuracy and high reduction. In order to solve these problems, we present a new parameter-free hybrid instance selection algorithm based on local sets with natural neighbors (LSNaNIS). A new parameter-free definition for the local set is first proposed based on the fast search for natural neighbors. The new local set can fast and reasonably describe local characteristics of data. In LSNaNIS, we use the new local set to design an edition method (LSEdit) to remove harmful samples, a border method (LSBorder) to retain representative border samples and a core method (LSCore) to condense internal samples. Comparison experiments show that LSNaNIS is relatively fast and outperforms existing hybrid methods in improving the k-nearest neighbor in terms of both accuracy and reduction.
引用
收藏
页码:1527 / 1541
页数:15
相关论文
共 45 条
[1]   Local sets for multi-label instance selection [J].
Arnaiz-Gonzalez, Alvar ;
Diez-Pastor, Jose-Francisco ;
Rodriguez, Juan J. ;
Garcia-Osorio, Cesar .
APPLIED SOFT COMPUTING, 2018, 68 :651-666
[2]   MULTIDIMENSIONAL BINARY SEARCH TREES USED FOR ASSOCIATIVE SEARCHING [J].
BENTLEY, JL .
COMMUNICATIONS OF THE ACM, 1975, 18 (09) :509-517
[3]   Advances in instance selection for instance-based learning algorithms [J].
Brighton, H ;
Mellish, C .
DATA MINING AND KNOWLEDGE DISCOVERY, 2002, 6 (02) :153-172
[4]   Combining instance selection methods based on data characterization: An approach to increase their effectiveness [J].
Caises, Yoel ;
Gonzalez, Antonio ;
Leyva, Enrique ;
Perez, Raul .
INFORMATION SCIENCES, 2011, 181 (20) :4780-4798
[5]   Prototype selection to improve monotonic nearest neighbor [J].
Cano, Jose-Ramon ;
Aljohani, Naif R. ;
Abbasi, Rabeeh Ayaz ;
Alowidbi, Jalal S. ;
Garcia, Salvador .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2017, 60 :128-135
[6]   ATISA: Adaptive Threshold-based Instance Selection Algorithm [J].
Cavalcanti, George D. C. ;
Ren, Tsang Ing ;
Pereira, Cesar Lima .
EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (17) :6894-6900
[7]   Evolutionary feature and instance selection for traffic sign recognition [J].
Chen, Zong-Yao ;
Lin, Wei-Chao ;
Ke, Shih-Wen ;
Tsai, Chih-Fong .
COMPUTERS IN INDUSTRY, 2015, 74 :201-211
[8]   A local cores-based hierarchical clustering algorithm for data sets with complex structures [J].
Cheng, Dongdong ;
Zhu, Qingsheng ;
Huang, Jinlong ;
Wu, Quanwang ;
Yang, Lijun .
NEURAL COMPUTING & APPLICATIONS, 2019, 31 (11) :8051-8068
[9]   Natural neighbor-based clustering algorithm with local representatives [J].
Cheng, Dongdong ;
Zhu, Qingsheng ;
Huang, Jinlong ;
Yang, Lijun ;
Wu, Quanwang .
KNOWLEDGE-BASED SYSTEMS, 2017, 123 :238-253
[10]   Multi-label learning of non-equilibrium labels completion with mean shift [J].
Cheng Yusheng ;
Zhao Dawei ;
Zhan Wenfa ;
Wang Yibin .
NEUROCOMPUTING, 2018, 321 :92-102