Improved Mutual Information Method For Text Feature Selection

Cited by: 0
Authors
Ding Xiaoming [1]
Tang Yan [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Source
PROCEEDINGS OF THE 2013 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2013) | 2013
Keywords
text classification; feature selection; mutual information
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Reducing the dimensionality of a high-dimensional feature set is one of the main difficulties in text categorization. Feature selection has been applied effectively in text classification because of its low computational complexity. Prior research shows that mutual information is a good feature selection method, but it does not consider the term frequency in each category of the corpus or the connections between terms. To remedy these defects of the traditional mutual information method, this article improves the mutual information measure by introducing the in-class feature frequency and the in-class dispersion of a feature. An experimental platform was built by constructing a Chinese text classification system, and multiple sets of experiments were run on this system. The results show that the new feature selection approach yields better text categorization results.
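For orientation, the baseline the abstract refers to can be sketched as follows. This is a minimal illustration of the classic document-frequency form of mutual information, MI(t, c) = log(P(t, c) / (P(t) P(c))), and not the improved measure from the paper, whose formula is not given in this record. The function name `mi_scores` and the toy corpus are assumptions made purely for illustration.

```python
import math
from collections import Counter


def mi_scores(docs, labels):
    """Classic MI per (term, class) pair, estimated from document counts.

    A term counts once per document, however often it occurs there,
    so within-document term frequency is ignored by construction.
    """
    n = len(docs)
    class_count = Counter(labels)   # documents per class
    term_count = Counter()          # documents containing the term
    joint_count = Counter()         # documents of a class containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint_count[(term, label)] += 1
    # P(t, c) / (P(t) * P(c)) simplifies to n_tc * n / (n_t * n_c).
    return {
        (term, label): math.log(n_tc * n / (term_count[term] * class_count[label]))
        for (term, label), n_tc in joint_count.items()
    }


# Hypothetical toy corpus: tokenized documents and their class labels.
docs = [
    ["ball", "game", "game", "game"],
    ["ball", "vote"],
    ["vote", "law"],
    ["game", "ball"],
]
labels = ["sports", "politics", "politics", "sports"]
scores = mi_scores(docs, labels)
```

In this toy corpus, "law" (one occurrence in one politics document) receives the same score for its class as "game" (four occurrences across both sports documents). That is the kind of insensitivity to in-class frequency that the abstract names as a defect of the traditional measure.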
Pages: 163-166
Page count: 4