Text Classification Based on Naive Bayes Algorithm with Feature Selection

Cited by: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliations
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
Source
INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL | 2012 / Vol. 15 / No. 10
Funding
National Natural Science Foundation of China;
Keywords
Text classification; Naive bayes; Feature selection;
DOI
Not available
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Text classification is the task of assigning documents to predefined classes, and it has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification with better performance than rule-based approaches. However, most machine learning methods need large numbers of training samples, which not only makes data collection laborious but also demands more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and can give more accurate results on large training sets. To improve performance, feature selection mechanisms are incorporated into the naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. The naive Bayes classification algorithm is then used for text classification. Experimental results show that this method maintains high classification accuracy.
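The two-step pipeline described in the abstract (select features, then classify with naive Bayes) can be illustrated with a minimal sketch. The abstract does not say which selection criterion or which naive Bayes variant the authors used, so everything here is an assumption: the chi-square score, the multinomial model with Laplace smoothing, the toy documents, and the function names `chi2_scores`, `select_features`, `train_nb`, and `predict` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def chi2_scores(docs, labels):
    """Chi-square score per term (assumed criterion), maximised over classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()             # number of documents containing the term
    term_class = defaultdict(Counter)  # term -> class -> documents containing it
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            term_count[t] += 1
            term_class[t][y] += 1
    scores = {}
    for t, df in term_count.items():
        best = 0.0
        for cls, n_cls in class_count.items():
            a = term_class[t][cls]     # in class, term present
            b = df - a                 # other classes, term present
            c = n_cls - a              # in class, term absent
            d = n - n_cls - b          # other classes, term absent
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[t] = best
    return scores

def select_features(docs, labels, k):
    """Keep the k highest-scoring terms; ties are broken alphabetically."""
    scores = chi2_scores(docs, labels)
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])

def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with Laplace smoothing over the selected vocab."""
    class_docs = Counter(labels)
    counts = {cls: Counter() for cls in class_docs}
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for cls in class_docs:
        total = sum(counts[cls].values()) + len(vocab)
        log_prior = math.log(class_docs[cls] / len(docs))
        log_like = {t: math.log((counts[cls][t] + 1) / total) for t in vocab}
        model[cls] = (log_prior, log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(cls):
        log_prior, log_like = model[cls]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(sorted(model), key=score)

# Toy data, invented purely for illustration.
docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for monday".split(),
    "project meeting notes for review".split(),
]
labels = ["spam", "spam", "ham", "ham"]

vocab = select_features(docs, labels, k=5)
model = train_nb(docs, labels, vocab)
print(sorted(vocab))
print(predict(model, "buy cheap now".split()))
print(predict(model, "monday meeting for review".split()))
```

Selecting features before training shrinks the vocabulary the classifier has to store and score, which is the storage and computation saving the abstract points to. Whether accuracy holds up depends on the data and on the selection criterion actually used.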
Pages: 4255-4260
Page count: 6