Enhancement of DTP feature selection method for text categorization

Cited by: 0
Authors
Moyotl-Hernández, E [1 ]
Jiménez-Salazar, H [1 ]
Affiliations
[1] Univ Autonoma Puebla, Fac Ciencias Comp, Puebla 72570, Mexico
Source
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING | 2005 / Vol. 3406
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper studies the structure of the vectors obtained by applying term selection methods to high-dimensional text collections. We found that the distance to transition point (DTP) method omits commonly occurring terms, which are poor discriminators between documents but convey important information about a collection. Experimental results obtained on the Reuters-21578 collection with the k-NN classifier show that feature selection by DTP combined with common terms slightly outperforms simple document frequency.
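The abstract does not define DTP or say how the common terms are chosen, so the following is only a minimal sketch under stated assumptions. It assumes the transition point formula commonly quoted in the literature, TP = (sqrt(1 + 8 * I1) - 1) / 2, where I1 is the number of terms occurring exactly once, and it assumes "common terms" means the terms with the highest document frequency. The function names and the parameters `n_dtp` and `n_common` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def transition_point(term_freqs):
    """Assumed formula: TP = (sqrt(1 + 8 * I1) - 1) / 2,
    where I1 is the number of terms with collection frequency 1."""
    i1 = sum(1 for f in term_freqs.values() if f == 1)
    return (math.sqrt(1 + 8 * i1) - 1) / 2


def select_features(docs, n_dtp, n_common):
    """Return (terms closest to the transition point,
    extra high-document-frequency terms not already selected).

    docs is a list of tokenized documents (lists of strings).
    """
    # Collection frequency: total occurrences of each term.
    tf = Counter(t for d in docs for t in d)
    # Document frequency: number of documents containing each term.
    df = Counter(t for d in docs for t in set(d))

    tp = transition_point(tf)
    # DTP score: absolute distance between a term's frequency and TP.
    # Ties are broken alphabetically so the result is deterministic.
    by_dtp = sorted(tf, key=lambda t: (abs(tf[t] - tp), t))[:n_dtp]

    chosen = set(by_dtp)
    # Assumption: "common terms" are those with the highest document frequency.
    by_df = sorted(df, key=lambda t: (-df[t], t))
    common = [t for t in by_df if t not in chosen][:n_common]
    return by_dtp, common


docs = [
    ["oil", "price", "the"],
    ["the", "oil", "market"],
    ["the", "wheat", "price"],
    ["the", "trade"],
]
dtp_terms, common_terms = select_features(docs, n_dtp=2, n_common=1)
print(dtp_terms, common_terms)
```

In this toy collection the very frequent term lies farthest from the transition point, so a pure DTP ranking leaves it out; the second list adds it back, which mirrors the combination the abstract describes.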
Pages: 719-722
Page count: 4