A customizable text classifier for text mining

Cited by: 0
Authors
Zhang, Yun-Liang [1]
Zhang, Quan [2]
Affiliations
[1] Institute of Acoustics, Graduate School, Chinese Academy of Sciences
[2] Institute of Acoustics, Chinese Academy of Sciences
Keywords
Natural Language Processing (NLP); Text categorization; Text mining
DOI
10.2481/dsj.6.S904
Abstract
Text mining deals with complex, unstructured texts, and it usually requires a collection of texts specific to one or more domains. We have developed a customizable text classifier that lets users mine such a collection automatically. It is derived from the sentence category of HNC theory and the corresponding techniques. The classifier can start from only a few texts, and it can adjust itself automatically or be adjusted by the user. The user can also control the number of domains chosen and set the criterion for selecting texts, based on demand and the abundance of material. The performance of the classifier varies with the user's choices.
Pages: S904-S909
Number of pages: 5