Enhancement of DTP feature selection method for text categorization

Cited by: 0
Authors
Moyotl-Hernández, E [1]
Jiménez-Salazar, H [1]
Affiliations
[1] Univ Autonoma Puebla, Fac Ciencias Comp, Puebla 72570, Mexico
Source
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING | 2005 / Vol. 3406
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the structure of the vectors obtained by applying term selection methods to high-dimensional text collections. We found that the distance to transition point (DTP) method omits commonly occurring terms, which are poor discriminators between documents but convey important information about a collection. Experimental results obtained on the Reuters-21578 collection with the k-NN classifier show that feature selection by DTP combined with common terms slightly outperforms simple document frequency.
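The abstract does not give the formulas, so the following is only a minimal sketch of the idea under stated assumptions. It assumes the transition point is computed as TP = (sqrt(8 * I1 + 1) - 1) / 2, where I1 is the number of terms occurring exactly once, and that DTP scores a term by the absolute difference between its frequency and TP. The way common terms are combined with DTP here (appending the terms with the highest document frequency) is a guess for illustration, and the names `transition_point` and `select_terms` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter

def transition_point(term_freqs):
    """Assumed formula: TP = (sqrt(8 * I1 + 1) - 1) / 2,
    where I1 is the number of terms that occur exactly once."""
    i1 = sum(1 for f in term_freqs.values() if f == 1)
    return (math.sqrt(8 * i1 + 1) - 1) / 2

def select_terms(docs, k_dtp, k_common):
    """Pick the k_dtp terms closest to the transition point, then add
    the k_common most widespread terms that DTP left out (assumption)."""
    # Collection-wide term frequency and document frequency.
    tf = Counter(t for d in docs for t in d)
    df = Counter(t for d in docs for t in set(d))
    tp = transition_point(tf)
    # Smaller distance to the transition point ranks first; ties by term.
    by_dtp = sorted(tf, key=lambda t: (abs(tf[t] - tp), t))[:k_dtp]
    chosen = set(by_dtp)
    # Common terms: highest document frequency, skipping those already chosen.
    by_df = sorted(df, key=lambda t: (-df[t], t))
    common = [t for t in by_df if t not in chosen][:k_common]
    return by_dtp + common

docs = [["a", "b", "c"], ["a", "b", "d"], ["a", "e", "f"]]
selected = select_terms(docs, k_dtp=1, k_common=1)
```

In this toy collection the term `a` appears in every document, so a pure DTP ranking places it behind `b`; the second step is what brings it back into the feature set.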
Pages: 719 - 722
Number of pages: 4
Related papers
50 in total
  • [21] Cascaded feature selection in SVMs text categorization
    Masuyama, T
    Nakagawa, H
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PROCEEDINGS, 2003, 2588 : 588 - 591
  • [22] A General Framework of Feature Selection for Text Categorization
    Jing, Hongfang
    Wang, Bin
    Yang, Yahui
    Xu, Yan
    MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, 2009, 5632 : 647 - +
  • [23] A feature selection and classification technique for text categorization
    Girgis, MR
    Aly, AA
    INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS, 2003, 12 (04) : 441 - 454
  • [24] Text Categorization Based on Clustering Feature Selection
    Zhou, Xiaofei
    Hu, Yue
    Guo, Li
    2ND INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND QUANTITATIVE MANAGEMENT, ITQM 2014, 2014, 31 : 398 - 405
  • [25] An examination of feature selection frameworks in text categorization
    How, BC
    Kiong, WT
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2005, 3689 : 558 - 564
  • [26] Feature selection based on feature interactions with application to text categorization
    Tang, Xiaochuan
    Dai, Yuanshun
    Xiang, Yanping
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 120 : 207 - 216
  • [27] A Method of Feature Selection Based on Word2Vec in Text Categorization
    Tian, Wenfeng
    Li, Jun
    Li, Hongguang
    2018 37TH CHINESE CONTROL CONFERENCE (CCC), 2018, : 9452 - 9455
  • [28] Max-difference maximization criterion: a feature selection method for text categorization
    Lingbin Jin
    Li Zhang
    Lei Zhao
    Frontiers of Computer Science, 2023, 17
  • [29] Improved Feature-Selection Method Considering the Imbalance Problem in Text Categorization
    Yang, Jieming
    Qu, Zhaoyang
    Liu, Zhiying
    SCIENTIFIC WORLD JOURNAL, 2014,
  • [30] A NOVEL EMBEDDED FEATURE SELECTION METHOD: A COMPARATIVE STUDY IN THE APPLICATION OF TEXT CATEGORIZATION
    Imani, Maryam Bahojb
    Keyvanpour, Mohammad Reza
    Azmi, Reza
    APPLIED ARTIFICIAL INTELLIGENCE, 2013, 27 (05) : 408 - 427