Improved Mutual Information Method For Text Feature Selection

Cited by: 0
Authors
Ding Xiaoming [1]
Tang Yan [1]
Affiliations
[1] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Source
PROCEEDINGS OF THE 2013 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2013), 2013
Keywords
text classification; feature selection; mutual information
DOI
Not available
CLC Classification Number
TP39 [Computer Applications]
Subject Classification Code
081203; 0835
Abstract
Reducing the dimensionality of a high-dimensional feature set is one of the difficulties of text categorization. Feature selection has been applied effectively in text classification because of its low computational complexity. Research shows that mutual information is a good feature selection method, but it considers neither the term frequency within each category of the corpus nor the connections between terms. To remedy these defects of the traditional mutual information method, this article improves the mutual information measure by introducing the feature frequency within a class and the dispersion of the feature within a class, builds an experimental platform by constructing a Chinese text classification system, and runs multiple sets of experiments on this system. The results show that the new feature selection approach performs better in text categorization.
Pages: 163 - 166
Number of pages: 4
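
The abstract above describes weighting mutual information by a feature's frequency within a class and its dispersion within that class. The Python sketch below is only a rough illustration of that idea, not the paper's own formula: the document counts A, B, C, N, the helper names, and the specific weighting scheme are all assumptions introduced for the example.

```python
import math

def mutual_information(A, B, C, N):
    """Classical MI score for term t and class c.

    A: documents in class c that contain t
    B: documents outside class c that contain t
    C: documents in class c that do not contain t
    N: total number of documents
    MI(t, c) ~= log( A * N / ((A + C) * (A + B)) )
    """
    return math.log((A * N + 1e-12) / ((A + C) * (A + B) + 1e-12))

def improved_mi(A, B, C, N, tf_in_class, tf_per_doc):
    """Hypothetical improved score (illustrative assumption, not the
    paper's exact formula): classical MI weighted by the term's total
    frequency inside the class and by a dispersion factor that rewards
    terms spread evenly across the class's documents.

    tf_in_class: total occurrences of t in documents of class c
    tf_per_doc:  per-document occurrence counts of t within class c
    """
    n = len(tf_per_doc)
    mean = sum(tf_per_doc) / n
    variance = sum((x - mean) ** 2 for x in tf_per_doc) / n
    dispersion = 1.0 / (1.0 + variance)  # low variance -> factor near 1
    return mutual_information(A, B, C, N) * tf_in_class * dispersion

# Example: the same term frequency, spread evenly vs. concentrated
print(improved_mi(A=30, B=5, C=70, N=1000, tf_in_class=120,
                  tf_per_doc=[4, 4, 3, 5, 4]))
print(improved_mi(A=30, B=5, C=70, N=1000, tf_in_class=120,
                  tf_per_doc=[60, 0, 0, 60, 0]))
```

In this sketch, a low variance of per-document counts pushes the dispersion factor toward 1, so a term that appears evenly across a class outranks one concentrated in a few documents, which is the intuition behind the dispersion criterion mentioned in the abstract.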