Text Classification Based on Naive Bayes Algorithm with Feature Selection

Times cited: 0
Authors
Chen, Zhenguo [1]
Shi, Guang [1]
Wang, Xiaoju [1]
Affiliations
[1] N China Inst Sci & Technol, Dept Comp Sci & Technol, Beijing 101601, Peoples R China
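Source
INFORMATION-AN INTERNATIONAL INTERDISCIPLINARY JOURNAL | 2012 / Vol. 15 / No. 10
Funding
National Natural Science Foundation of China
Keywords
Text classification; Naive Bayes; Feature selection
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract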
Text classification is the task of assigning documents to predefined classes, and it has become one of the key techniques for organizing information. Machine learning, a branch of artificial intelligence, has been applied to text classification and performs better than rule-based approaches. However, most machine learning methods need a large number of training samples, which not only makes data collection laborious but also demands more storage and computing resources during processing. Naive Bayes is one of the most efficient and effective inductive learning algorithms and produces more accurate results on large training sets. To improve its performance, feature selection mechanisms are incorporated into the naive Bayes algorithm. First, feature extraction techniques are applied to remove irrelevant and redundant features. The naive Bayes classification algorithm is then used to classify the text. The experimental results show that this method maintains high classification accuracy.
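The two-stage pipeline described in the abstract (select features, then classify with naive Bayes) can be illustrated with a minimal sketch. The abstract does not say which feature selection criterion the authors used, so the chi-square score below is an assumption chosen for illustration, as are the multinomial model with add-one smoothing, the function names, and the toy corpus. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def chi2_scores(docs, labels):
    """Chi-square score per term (document presence), maximised over classes."""
    n = len(docs)
    class_docs = Counter(labels)
    df = Counter()                     # documents containing the term
    df_by_class = defaultdict(Counter)  # same, split by class
    for tokens, y in zip(docs, labels):
        for t in set(tokens):
            df[t] += 1
            df_by_class[y][t] += 1
    scores = {}
    for t in df:
        best = 0.0
        for cls, n_cls in class_docs.items():
            n11 = df_by_class[cls][t]   # in class, contains term
            n10 = df[t] - n11           # other classes, contains term
            n01 = n_cls - n11           # in class, lacks term
            n00 = n - n_cls - n10       # other classes, lacks term
            denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
            if denom:
                best = max(best, n * (n11 * n00 - n10 * n01) ** 2 / denom)
        scores[t] = best
    return scores


def select_features(docs, labels, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    scores = chi2_scores(docs, labels)
    return set(sorted(scores, key=lambda t: (-scores[t], t))[:k])


def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with add-one smoothing over the selected vocab."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        term_counts[y].update(t for t in tokens if t in vocab)
    model = {}
    for cls, n_cls in class_docs.items():
        total = sum(term_counts[cls].values()) + len(vocab)
        log_lik = {t: math.log((term_counts[cls][t] + 1) / total) for t in vocab}
        model[cls] = (math.log(n_cls / len(docs)), log_lik)
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unselected terms are ignored."""
    def score(cls):
        log_prior, log_lik = model[cls]
        return log_prior + sum(log_lik[t] for t in tokens if t in log_lik)
    return max(sorted(model), key=score)


# Hypothetical toy corpus, for illustration only.
docs = [
    "cheap pills buy now".split(),
    "buy cheap watches now".split(),
    "meeting agenda for the project".split(),
    "project meeting at noon".split(),
]
labels = ["spam", "spam", "ham", "ham"]

vocab = select_features(docs, labels, k=5)
model = train_nb(docs, labels, vocab)
```

Calling `predict(model, "buy cheap now".split())` classifies a new document using only the selected terms. Shrinking the vocabulary before training reduces the number of per-class parameters to estimate and store, which is the storage and computation saving the abstract points to.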
Pages: 4255-4260
Number of pages: 6