An improved kNN text classification method
Cited by: 20
Authors:
Wang, Fengfei [1]
Liu, Zhen [1,2]
Wang, Chundong [1]
Affiliations:
[1] Tianjin Univ Technol, Grad Sch Comp & Commun Engn, Tianjin, Peoples R China
[2] Nagasaki Inst Appl Sci, Grad Sch Engn, 536 Aba Machi, Nagasaki 8510193, Japan
Keywords:
text classification;
k-nearest neighbours;
kNN;
self-organising map;
SOM;
neural network;
computer science;
engineering;
DOI:
10.1504/IJCSE.2019.103944
Chinese Library Classification (CLC) number:
TP39 [Computer applications];
Discipline classification code:
081203 ;
0835 ;
Abstract:
This paper proposes an improved kNN text classification method. The kNN algorithm in the vector space model (VSM) has several limitations: it occupies excessive storage space, and all dimensions share the same weight, which makes classification inaccurate. To solve these problems, this paper proposes a SOM neural network with principal component weighting. In this model, the principal component analysis process is embedded into the SOM neural network. Specifically, principal component analysis is used to extract the main feature components of the assessed target, which are then input into the network for computation. Meanwhile, the variance contribution rates of the principal components are introduced into the Euclidean distance function as weights. Using the principal component weighting SOM algorithm to compute the weights of the VSM dimensions, together with the kNN algorithm, could effectively reduce the dimensions of the vector space and increase the precision and speed of the kNN text classification method.
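The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the SOM network and the PCA step are omitted, the document vectors are assumed to be already projected onto two principal components, and the weights stand in for variance contribution rates. The function names `weighted_euclidean` and `knn_predict`, the toy data, and the weight values are all hypothetical.

```python
from collections import Counter
from math import sqrt


def weighted_euclidean(a, b, weights):
    """Euclidean distance with one weight per dimension.

    In the setting sketched here, each weight is assumed to be the variance
    contribution rate of the matching principal component.
    """
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train_vectors, train_labels, query, weights, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: weighted_euclidean(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): documents already projected onto two principal components.
train_vectors = [
    (0.0, 0.0), (0.2, 5.0), (0.1, -4.0),   # class "sports"
    (3.0, 0.1), (3.2, 4.0), (2.9, -5.0),   # class "finance"
]
train_labels = ["sports"] * 3 + ["finance"] * 3

# Assumed variance contribution rates: the first component explains 90%
# of the variance, the second 10%.
weights = (0.9, 0.1)

query = (2.0, 0.0)
predicted = knn_predict(train_vectors, train_labels, query, weights, k=3)
print(predicted)
```

With these weights, differences along the first component dominate the distance, so neighbours are chosen mainly by their position on the component that carries most of the variance.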
Pages: 397-403
Number of pages: 7