Classification technique of Chinese agricultural text information based on SVM

Cited by: 0
Authors
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China [1]
Unknown, 100097, China [2]
Affiliations
[1] College of Information and Electrical Engineering, China Agricultural University, Beijing
[2] Beijing Engineering Research Center of Agricultural Internet of Things, Beijing
Source
Nongye Jixie Xuebao, pp. 174-179
Keywords
Chinese agricultural information; Information integration; Support vector machine; Text classification
DOI
10.6041/j.issn.1000-1298.2015.S0.029
Abstract
To provide personalized agricultural information recommendation services, information must be organized and classified efficiently. Based on the characteristics of agricultural texts, a Chinese agricultural text classification model built on a linear support vector machine (SVM) was proposed. First, an agriculture-domain dictionary was built. Second, a feature vector was extracted and a weight was selected for each feature in the vector. Finally, a text classification model was established. The model was tested on 1071 documents belonging to four classes: planting, forestry, animal husbandry and fisheries. The results showed an accuracy of 96.5% and a recall rate of 96.4%, both higher than those of other classification methods, such as Bayesian, decision tree, KNN, SMO and neural network classifiers. The model was applied to the integrated information service platform for the agricultural Internet of Things (IoT) industry. In operation, the method classified Chinese agricultural text information automatically, and its response time met the system requirements. © 2015, Chinese Society for Agricultural Machinery. All rights reserved.
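The three steps named in the abstract (domain dictionary, weighted feature vector, linear SVM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the weighting scheme, so TF-IDF is assumed here; the dictionary, the documents and all function names are hypothetical; the documents are already segmented English tokens rather than Chinese text; and only two of the four classes are used, with a bias-free SVM trained by plain subgradient descent.

```python
import math

# Hypothetical toy dictionary; the paper's agriculture-domain dictionary is not given.
DICTIONARY = ["wheat", "rice", "irrigation", "cattle", "feed", "vaccine"]

# Hypothetical pre-segmented documents. Labels: +1 = planting, -1 = animal husbandry.
TRAIN = [
    (["wheat", "irrigation", "wheat"], 1),
    (["rice", "irrigation"], 1),
    (["rice", "wheat", "feed"], 1),
    (["cattle", "feed", "vaccine"], -1),
    (["cattle", "vaccine"], -1),
    (["feed", "cattle", "irrigation"], -1),
]


def idf_table(docs):
    """Smoothed inverse document frequency for every dictionary term."""
    n = len(docs)
    return {
        term: math.log((1 + n) / (1 + sum(term in tokens for tokens, _ in docs))) + 1.0
        for term in DICTIONARY
    }


def tfidf_vector(tokens, idf):
    """L2-normalised TF-IDF vector restricted to dictionary terms (assumed weighting)."""
    raw = [tokens.count(term) * idf[term] for term in DICTIONARY]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]


def train_linear_svm(samples, lam=0.01, eta=0.1, epochs=200):
    """Subgradient descent on the L2-regularised hinge loss (no bias term, for brevity)."""
    w = [0.0] * len(DICTIONARY)
    for _ in range(epochs):
        for x, y in samples:
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]  # shrink: regulariser gradient
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


def recall(y_true, y_pred, label):
    """Fraction of documents of class `label` that were predicted as `label`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    total = sum(1 for t in y_true if t == label)
    return hits / total if total else 0.0


IDF = idf_table(TRAIN)
SAMPLES = [(tfidf_vector(tokens, IDF), label) for tokens, label in TRAIN]
WEIGHTS = train_linear_svm(SAMPLES)
Y_TRUE = [label for _, label in TRAIN]
Y_PRED = [predict(WEIGHTS, x) for x, _ in SAMPLES]

print("training recall (planting):", recall(Y_TRUE, Y_PRED, 1))
print("unseen document:", predict(WEIGHTS, tfidf_vector(["cattle", "vaccine", "vaccine"], IDF)))
```

A real system would segment Chinese text against the domain dictionary before this step, use one classifier per class (or another multi-class scheme) for the four categories, and report accuracy and recall on held-out documents rather than on the training set.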
Pages: 174-179
Page count: 5