Text Classification Based on Naive Bayes Algorithm with Feature Selection

Cited by: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliations
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
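Source
INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL | 2012 / Vol. 15 / No. 10
Funding
National Natural Science Foundation of China;
Keywords
Text classification; Naive Bayes; Feature selection;
DOI
Not available
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract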
Text classification is the task of assigning documents to predefined classes. It has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification with better performance than rule-based approaches. However, most machine learning methods require many training samples, which both increases the data collection workload and demands more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and yields more accurate results on large training sets. To improve its performance, feature selection mechanisms are incorporated into the naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. The naive Bayes classification algorithm is then used to classify the texts. The experimental results show that this method maintains high classification accuracy.
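The two-stage pipeline described in the abstract (select features, then classify with naive Bayes) can be illustrated with a minimal sketch. The abstract does not say which feature selection criterion the authors used, so the chi-square score below is an assumption chosen for illustration, as are the multinomial model with Laplace smoothing, the toy corpus, and all function names.

```python
import math
from collections import Counter, defaultdict


def chi_square_scores(docs, labels):
    """Score each term by its largest chi-square statistic over all classes."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()                   # number of documents containing the term
    term_class_count = defaultdict(Counter)  # term -> class -> documents containing it
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            term_class_count[term][label] += 1
    scores = {}
    for term, df in term_count.items():
        best = 0.0
        for cls, n_cls in class_count.items():
            a = term_class_count[term][cls]  # term present, class cls
            b = df - a                       # term present, other classes
            c = n_cls - a                    # term absent, class cls
            d = n - df - c                   # term absent, other classes
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi_square_scores(docs, labels)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_naive_bayes(docs, labels, vocab, alpha=1.0):
    """Fit a multinomial naive Bayes model restricted to the selected vocabulary."""
    n = len(docs)
    class_count = Counter(labels)
    word_count = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_count[label].update(t for t in tokens if t in vocab)
    model = {}
    for cls, n_cls in class_count.items():
        total = sum(word_count[cls].values()) + alpha * len(vocab)
        log_like = {t: math.log((word_count[cls][t] + alpha) / total) for t in vocab}
        model[cls] = (math.log(n_cls / n), log_like)
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(cls):
        log_prior, log_like = model[cls]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


# Toy corpus, invented purely for illustration.
docs = [
    "the match ended with a late goal".split(),
    "the team won the league match".split(),
    "the striker scored a goal for the team".split(),
    "the market closed with a stock rally".split(),
    "the bank raised the interest rate".split(),
    "the stock market fell as the rate rose".split(),
]
labels = ["sport"] * 3 + ["finance"] * 3

vocab = select_features(docs, labels, k=6)
model = train_naive_bayes(docs, labels, vocab)
sport_guess = predict(model, "the team scored a goal".split())
finance_guess = predict(model, "stock market rate".split())
```

In this sketch, terms that occur in every document (such as "the") receive a chi-square score of zero and are dropped, so the classifier only sees terms whose presence differs between classes. Whether this matches the paper's actual selection criterion cannot be determined from the abstract.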
Pages: 4255-4260
Page count: 6
Related papers
50 in total
  • [41] Study on the Method of Feature Selection Based on Hybrid Model for Text Classification
    Li, Runzhi
    Zhang, Yangsen
    [J]. MATERIALS SCIENCE AND INFORMATION TECHNOLOGY, PTS 1-8, 2012, 433-440 : 2881 - 2886
  • [42] Adapting naive Bayes tree for text classification
    Wang, Shasha
    Jiang, Liangxiao
    Li, Chaoqun
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 44 (01) : 77 - 89
  • [43] Utility-based feature selection for text classification
    Heyong Wang
    Ming Hong
    Raymond Yiu Keung Lau
    [J]. Knowledge and Information Systems, 2019, 61 : 197 - 226
  • [44] A Scalable Text Classification Using Naive Bayes with Hadoop Framework
    Temesgen, Mulualem Mheretu
    Lemma, Dereje Teferi
    [J]. INFORMATION AND COMMUNICATION TECHNOLOGY FOR DEVELOPMENT FOR AFRICA (ICT4DA 2019), 2019, 1026 : 291 - 300
  • [45] Topic document model approach for naive Bayes text classification
    Kim, SB
    Rim, HC
    Kim, JD
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (05) : 1091 - 1094
  • [46] Improved Naive Bayes with optimal correlation factor for text classification
    Jiangning Chen
    Zhibo Dai
    Juntao Duan
    Heinrich Matzinger
    Ionel Popescu
    [J]. SN Applied Sciences, 2019, 1
  • [47] Human action recognition based on boosted feature selection and naive Bayes nearest-neighbor classification
    Liu, Li
    Shao, Ling
    Rockett, Peter
    [J]. SIGNAL PROCESSING, 2013, 93 (06) : 1521 - 1530
  • [48] Advanced Naive Bayes Text Classifier with Embedded Feature Weighting Approach
    Kim, Han-joon
    [J]. INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL, 2009, 12 (03) : 607 - 620
  • [49] Text and Image Based Spam Email Classification using KNN, Naive Bayes and Reverse DBSCAN Algorithm
    Harisinghaney, Anirudh
    Dixit, Aman
    Gupta, Saurabh
    Arora, Anuja
    [J]. PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON RELIABILTY, OPTIMIZATION, & INFORMATION TECHNOLOGY (ICROIT 2014), 2014 : 153 - 155
  • [50] Naive Bayes Classification Algorithm Based on Optimized Training Data
    Zhu, Xiaodan
    Su, Jinsong
    Wu, Qingfeng
    Dong, Huailin
    [J]. MECHATRONICS AND INTELLIGENT MATERIALS II, PTS 1-6, 2012, 490-495 : 460 - 464