K-Nearest Neighbor Algorithm Optimization in Text Categorization

Cited by: 14
Authors
Chen, Shufeng [1]
Affiliations
[1] Univ Sci & Technol China, Res Inst Elect Sci & Technol, Chengdu 611730, Sichuan, Peoples R China
Source
2017 3RD INTERNATIONAL CONFERENCE ON ENVIRONMENTAL SCIENCE AND MATERIAL APPLICATION (ESMA2017), VOLS 1-4 | 2018 / Vol. 108
DOI
10.1088/1755-1315/108/5/052074
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods in data mining and has been widely used in classification, regression, and pattern recognition. The traditional KNN method has shortcomings, such as the large amount of computation over the samples and a strong dependence on the capacity of the sample library. This paper proposes a representative-sample optimization method based on the CURE algorithm. Building on this, it presents a fast algorithm, QKNN (Quick k-nearest neighbor), for finding the k nearest neighbor samples, which greatly reduces the number of similarity calculations. The experimental results show that the algorithm effectively reduces the number of samples and speeds up the search for the k nearest neighbor samples, improving the algorithm's performance.
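For orientation, the traditional KNN baseline that the abstract refers to can be sketched as follows. This is only an illustrative sketch and not the paper's QKNN or its CURE-based sample selection: the bag-of-words vectors, the cosine similarity measure, the function names, and the toy training set are all assumptions made for the example. It shows why the cost per query grows with the size of the sample library, since every stored sample is compared with the query.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, training_set, k=3):
    """Label `query` by majority vote among its k most similar training documents.

    Every training document is compared with the query, so the number of
    similarity calculations equals the size of the sample library.
    """
    query_vec = vectorize(query)
    scored = [
        (cosine_similarity(query_vec, vectorize(text)), label)
        for text, label in training_set
    ]
    # Highest similarity first; keep the k nearest neighbors.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy sample library, for illustration only.
training_set = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the match ended in a draw", "sports"),
    ("the central bank raised interest rates", "finance"),
    ("stock markets fell after the rate decision", "finance"),
    ("the bank reported higher quarterly profit", "finance"),
]

print(knn_classify("the goalkeeper saved the match", training_set, k=3))  # prints "sports"
```

Approaches of the kind the abstract describes aim to shrink `training_set` to representative samples and to avoid scoring every remaining sample; neither step is reproduced here.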
Pages: 5
Related papers
7 items in total
[1] Arjen P. de Vries, 2002, SIGMOD '02: Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, P322
[2] Cover, T. M.; Hart, P. E. Nearest Neighbor Pattern Classification [J]. IEEE Transactions on Information Theory, 1967, 13 (01): 21-+
[3] Han J., 2012, Data Mining, P393, DOI 10.1016/B978-0-12-381479-1.00009-5
[4] Li Ronglu [李荣陆], 2004, Journal of Computer Research and Development [计算机研究与发展], V41, P539
[5] Yang JL, 2004, Intelligence J, P137
[6] Yu Wang, 2005, Computer Appl, V25, P634
[7] Zhang Xiao-hui, 2003, Journal of Northeastern University (Natural Science), V24, P229