Two feature weighting approaches for naive Bayes text classifiers

Cited by: 79
Authors
Zhang, Lungan [1]
Jiang, Liangxiao [1, 2]
Li, Chaoqun [3]
Kong, Ganggang [1]
Affiliations
[1] China Univ Geosci, Dept Comp Sci, Wuhan 430074, Peoples R China
[2] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430074, Peoples R China
[3] China Univ Geosci, Dept Math, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Naive Bayes text classifiers; Feature weighting; Gain ratio; Decision tree;
DOI
10.1016/j.knosys.2016.02.017
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies feature weighting approaches for naive Bayes text classifiers. Almost all existing feature weighting approaches for naive Bayes text classifiers have one of two defects: they yield only limited improvement in classification performance, or they sacrifice the simplicity and short execution time of the final models. Feature weighting is not new to the machine learning community, and many researchers have made fruitful efforts in this field. This paper reviews some simple and efficient feature weighting approaches designed for standard naive Bayes classifiers and adapts them for naive Bayes text classifiers. As a result, it proposes two adaptive feature weighting approaches for naive Bayes text classifiers. Experimental results on benchmark and real-world data show that, compared to their competitors, the proposed approaches achieve higher classification accuracy while maintaining the simplicity and low execution time of the final models. (C) 2016 Elsevier B.V. All rights reserved.
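The abstract does not spell out the two proposed weighting schemes, so the following is only an illustrative sketch of the general idea it describes: a multinomial naive Bayes text classifier in which each word's log-likelihood term is multiplied by a feature weight. The weights here are derived from the gain ratio (one of the listed keywords) of a binary "word present / absent" split and rescaled so that they average to 1. That rescaling, the Laplace smoothing, and all function names (`gain_ratio`, `train`, `predict`) are assumptions made for illustration, not the authors' confirmed method.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (base 2) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def gain_ratio(docs, labels, word):
    """Gain ratio of splitting the documents on presence of `word`."""
    n = len(docs)
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    parts = [p for p in (present, absent) if p]
    base = entropy(list(Counter(labels).values()))
    cond = sum(len(p) / n * entropy(list(Counter(p).values())) for p in parts)
    split_info = entropy([len(p) for p in parts])
    return (base - cond) / split_info if split_info > 0 else 0.0


def train(docs, labels):
    """Fit a feature-weighted multinomial naive Bayes model (illustrative)."""
    vocab = sorted({w for d in docs for w in d})
    gr = {w: gain_ratio(docs, labels, w) for w in vocab}
    total_gr = sum(gr.values())
    # Assumed normalization: weights average to 1 over the vocabulary.
    weights = {
        w: (gr[w] * len(vocab) / total_gr if total_gr > 0 else 1.0)
        for w in vocab
    }
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    # Laplace-smoothed log prior and log conditional probabilities.
    log_prior = {
        c: math.log((class_docs[c] + 1) / (len(docs) + len(class_docs)))
        for c in class_docs
    }
    log_cond = {}
    for c in class_docs:
        denom = sum(word_counts[c].values()) + len(vocab)
        log_cond[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return weights, log_prior, log_cond


def predict(model, tokens):
    """Return the class maximizing log P(c) + sum_i W_i * f_i * log P(w_i|c)."""
    weights, log_prior, log_cond = model
    freqs = Counter(t for t in tokens if t in weights)  # skip unseen words

    def score(c):
        return log_prior[c] + sum(
            weights[w] * f * log_cond[c][w] for w, f in freqs.items()
        )

    return max(log_prior, key=score)


# Toy corpus (hypothetical data, for demonstration only).
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "price"],
    ["market", "price", "fall"],
]
labels = ["sport", "sport", "finance", "finance"]
model = train(docs, labels)
print(predict(model, ["goal", "team"]))
print(predict(model, ["market", "price"]))
```

Because the weights only scale existing log-probability terms, the weighted model keeps the same training and prediction cost profile as plain naive Bayes, which is in the spirit of the simplicity and execution-time goals stated in the abstract.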
Pages: 137 - 144
Page count: 8
Related papers
36 in total
  • [21] On the performance of feature weighting K-means for text subspace clustering
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZX
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2005, 3739 : 502 - 512
  • [22] Statistical computation of feature weighting schemes through data estimation for nearest neighbor classifiers
    Saez, Jose A.
    Derrac, Joaquin
    Luengo, Julian
    Herrera, Francisco
    PATTERN RECOGNITION, 2014, 47 (12) : 3941 - 3948
  • [23] CWC: A clustering-based feature weighting approach for text classification
    Zhu, Lin
    Guan, Jihong
    Zhou, Shuigeng
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2007, 4617 : 204 - +
  • [24] Study On Feature Selection And Weighting Based On Synonym Merge In Text Categorization
    Lu, Zhenyu
    Lin, Yongmin
    Zhao, Shuang
    Chen, Xuebin
    SECOND INTERNATIONAL CONFERENCE ON FUTURE NETWORKS: ICFN 2010, 2010, : 105 - 109
  • [25] Automatic image annotation base on Naive Bayes and Decision Tree classifiers using MPEG-7
    Majidpour, Jafar
    Jameel, Samer Kais
    2019 IEEE 5TH CONFERENCE ON KNOWLEDGE BASED ENGINEERING AND INNOVATION (KBEI 2019), 2019, : 7 - 12
  • [26] Subspace clustering of text documents with feature weighting K-means algorithm
    Jing, LP
    Ng, MK
    Xu, J
    Huang, JZ
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2005, 3518 : 802 - 812
  • [27] Comparison between two coevolutionary feature weighting algorithms in clustering
    Gancarski, P.
    Blansche, A.
    Wania, A.
    PATTERN RECOGNITION, 2008, 41 (03) : 983 - 994
  • [28] Activity recognition in a smart home using local feature weighting and variants of nearest-neighbors classifiers
    Labiba Gillani Fahad
    Syed Fahad Tahir
    Journal of Ambient Intelligence and Humanized Computing, 2021, 12 : 2355 - 2364
  • [29] Activity recognition in a smart home using local feature weighting and variants of nearest-neighbors classifiers
    Fahad, Labiba Gillani
    Tahir, Syed Fahad
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2021, 12 (02) : 2355 - 2364
  • [30] Feature Weighting Method Based on Real-coded Genetic Algorithm in Text Categorization
    Li, Junwei
    Li, Xiangqian
    2015 8TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2015, : 91 - 94