Research of Text Categorization on WEKA

Cited by: 5
Authors
Li Dan [1]
Liu Lihua [1]
Zhang Zhaoxin [1]
Affiliation
[1] Hebei Software Inst, Baoding 071000, Hebei, Peoples R China
Source
2013 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEM DESIGN AND ENGINEERING APPLICATIONS (ISDEA) | 2013
Keywords
Text categorization; Naive bayes; Decision tree; Support vector machines; Weka;
DOI
10.1109/ISDEA.2012.266
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The choice of algorithm is a key problem in text categorization. For a comprehensive evaluation, three popular text categorization algorithms are analyzed: naive Bayes (NB), decision tree (DT), and support vector machines (SVM). Simulation experiments were carried out using the open-source data mining tool Weka. The experimental results support several conclusions. All three classification methods perform well: SVM performs best, with the highest precision and recall, naive Bayes is second, and decision tree is lowest. Classification performance was also found to depend not only on the choice of classification algorithm but also on the differences between corpus categories.
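The abstract ranks the classifiers by precision and recall. As a minimal sketch of how those two measures are computed per category, the following Python snippet may help; it is not the authors' code and does not use Weka, and the function name `precision_recall`, the category names, and the label lists are all made up for illustration.

```python
from collections import Counter


def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every category seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            tp[truth] += 1      # correct prediction for this category
        else:
            fp[guess] += 1      # predicted category was wrong
            fn[truth] += 1      # true category was missed

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        predicted_total = tp[label] + fp[label]
        actual_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / actual_total if actual_total else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical labels for six documents (illustrative only, not the paper's data).
true_labels = ["sports", "sports", "finance", "finance", "finance", "health"]
predicted = ["sports", "finance", "finance", "finance", "health", "health"]

for label, (p, r) in sorted(precision_recall(true_labels, predicted).items()):
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Computing the scores per category, rather than as a single overall figure, is what makes category-level differences visible, which matches the abstract's remark that performance also depends on the corpus categories.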
Pages: 1129-1131
Page count: 3
Related papers
50 in total
  • [1] Text categorization with WEKA: A survey
    Merlini, Donatella
    Rossini, Martina
    MACHINE LEARNING WITH APPLICATIONS, 2021, 4
  • [2] Research of text categorization based on SVM
    Wang, Meihua
    Zhang, Hongbin
    Ding, Renshuang
    2010 INTERNATIONAL COLLOQUIUM ON COMPUTING, COMMUNICATION, CONTROL, AND MANAGEMENT (CCCM2010), VOL I, 2010, : 676 - 679
  • [3] Research of Text Categorization Based on SVM
    Wang, Meihua
    Zhang, Hongbin
    Ding, Renshuang
    PROCEEDINGS OF THE 2011 INTERNATIONAL CONFERENCE ON INFORMATICS, CYBERNETICS, AND COMPUTER ENGINEERING (ICCE2011), VOL 2: INFORMATION SYSTEMS AND COMPUTER ENGINEERING, 2011, 111 : 69 - 77
  • [4] COMPARATIVE RESEARCH ON SHORT TEXT CATEGORIZATION
    Chang, Juan
    Lu, Xueqin
    INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE & TECHNOLOGY, PROCEEDINGS, 2009, : 335 - 338
  • [5] Research of Text Categorization Based on Ontology
    Wang Jiayun
    Zhang Rui
    Wang Peng
    PROCEEDINGS OF 2009 CONFERENCE ON COMMUNICATION FACULTY, 2009, : 167 - 170
  • [6] The Improvement Research of Mutual Information Algorithm for Text Categorization
    Kai, Lu
    Li, Chen
    KNOWLEDGE ENGINEERING AND MANAGEMENT, ISKE 2013, 2014, 278 : 225 - 232
  • [7] Research of Text Categorization Model based on Random Forests
    Xue, Dashen
    Li, Fengxin
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION TECHNOLOGY CICT 2015, 2015, : 173 - 176
  • [8] The Research of Text Categorization based on FP-tree
    Zhu, Cuiling
    WISM: 2009 INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, : 173 - 177
  • [9] Research on Chinese Text Automatic Categorization Based on VSM
    Tong Xiao-Jun
    Cui Ming-Gen
    Song Guo-Long
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 3863 - +
  • [10] The Research of Tax Text Categorization based on Rough Set
    Liu, Bin
    Xu, Guang
    Xu, Qian
    Zhang, Nan
    2012 INTERNATIONAL CONFERENCE ON MEDICAL PHYSICS AND BIOMEDICAL ENGINEERING (ICMPBE2012), 2012, 33 : 1683 - 1688