Improved Mutual Information Method For Text Feature Selection

Cited by: 0
Authors
Ding Xiaoming [1]
Tang Yan [1]
Affiliation
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Source
PROCEEDINGS OF THE 2013 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2013) | 2013
Keywords
text classification; feature selection; mutual information
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Reducing the dimensionality of a high-dimensional feature set is one of the main difficulties in text categorization. Feature selection has been applied effectively in text classification because of its low computational complexity. Prior research shows that mutual information is a good feature selection method, but it does not consider the term frequency in each category of the corpus or the connections between terms. To remedy these defects of the traditional mutual information method, this article improves the mutual information measure by introducing the frequency of a feature within a class and the dispersion of a feature within a class. An experimental platform was built by constructing a Chinese text classification system, and multiple sets of experiments were run on this system. The results show that the new feature selection approach performs better in text categorization.
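The record does not give the paper's formula, so the following is only a minimal Python sketch of the general idea described in the abstract. `mutual_information` is the classic document-count estimate of pointwise mutual information between a term and a category. `weighted_score` is a hypothetical illustration, not the authors' measure: it scales that value by an assumed in-class term frequency and an assumed in-class dispersion (the share of the class's documents that contain the term). Documents are assumed to be lists of tokens.

```python
import math


def mutual_information(docs, labels, term, category):
    """Classic pointwise MI between a term and a category, from document counts."""
    n = len(docs)
    # a: documents in the category that contain the term
    a = sum(1 for d, y in zip(docs, labels) if y == category and term in d)
    # b: documents outside the category that contain the term
    b = sum(1 for d, y in zip(docs, labels) if y != category and term in d)
    # c: documents in the category that do not contain the term
    c = sum(1 for d, y in zip(docs, labels) if y == category and term not in d)
    if a == 0:
        return float("-inf")  # term never co-occurs with the category
    return math.log(a * n / ((a + c) * (a + b)))


def weighted_score(docs, labels, term, category):
    """Hypothetical variant: MI scaled by in-class frequency and dispersion.

    This weighting is an assumption made for illustration only; it is not
    the formula from the paper.
    """
    in_class = [d for d, y in zip(docs, labels) if y == category]
    total_tokens = sum(len(d) for d in in_class)
    if total_tokens == 0:
        return 0.0
    mi = mutual_information(docs, labels, term, category)
    if mi == float("-inf"):
        return 0.0
    # Share of the class's tokens that are this term.
    term_frequency = sum(d.count(term) for d in in_class) / total_tokens
    # Share of the class's documents that contain the term at least once.
    dispersion = sum(1 for d in in_class if term in d) / len(in_class)
    return mi * term_frequency * dispersion


if __name__ == "__main__":
    docs = [
        ["goal", "match", "goal"],
        ["match", "team"],
        ["stock", "market"],
        ["market", "match"],
    ]
    labels = ["sport", "sport", "finance", "finance"]
    for term in ("goal", "match"):
        print(term,
              mutual_information(docs, labels, term, "sport"),
              weighted_score(docs, labels, term, "sport"))
```

In this toy corpus, plain mutual information depends only on which documents contain a term, so it ignores how often the term occurs inside the class. The weighted variant brings in that frequency and how evenly the term is spread across the class's documents, which are the two kinds of information the abstract says the traditional measure lacks.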
Pages: 163-166
Page count: 4
Related papers
50 in total
  • [1] Discriminant Mutual Information for Text Feature Selection
    Wang, Jiaqi
    Zhang, Li
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II, 2021, 12682 : 136 - 151
  • [2] Feature selection algorithm for text classification based on improved mutual information
    丛帅
    张积宾
    徐志明
    王宇颖
    Journal of Harbin Institute of Technology(New series), 2011, (03) : 144 - 148
  • [3] Study of E-mail Filtering Based on Mutual Information Text Feature Selection Method
    Gong, Shangfu
    Gong, Xingyu
    Wang, Yuan
    INSTRUMENTATION, MEASUREMENT, CIRCUITS AND SYSTEMS, 2012, 127 : 33 - 39
  • [4] Mutual Information Using Sample Variance for Text Feature Selection
    Agnihotri, Deepak
    Verma, Kesari
    Tripathi, Priyanka
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND INFORMATION PROCESSING (ICCIP 2017), 2017, : 39 - 44
  • [5] Feature Selection for Text Classification Using Mutual Information
    Sel, Ilhami
    Karci, Ali
    Hanbay, Davut
    2019 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP 2019), 2019,
  • [6] Spam Feature Selection Based on the Improved Mutual Information Algorithm
    Liang Ting
    Yu Qingsong
    2012 FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION NETWORKING AND SECURITY (MINES 2012), 2012, : 67 - 70
  • [7] An Improved Text Feature Selection Method for Transfer Learning
    Liu, Jiang
    Wang, Hao
    Liu, Jun
    CONTEMPORARY RESEARCH ON E-BUSINESS TECHNOLOGY AND STRATEGY, 2012, 332 : 600 - +
  • [8] A Clustering Based Feature Selection Method Using Feature Information Distance for Text Data
    Chao, Shilong
    Cai, Jie
    Yang, Sheng
    Wang, Shulin
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2016, PT I, 2016, 9771 : 122 - 132
  • [9] A new feature selection method for handling redundant information in text classification
    You-wei Wang
    Li-zhou Feng
    Frontiers of Information Technology & Electronic Engineering, 2018, 19 : 221 - 234
  • [10] An improved text feature selection method for transfer learning
    Liu, Jiang
    Wang, Hao
    Liu, Jun
    Communications in Computer and Information Science, 2013, 332 : 600 - 611