Multilabel text categorization based on a new linear classifier learning method and a category-sensitive refinement method

Cited by: 19
Authors
Chang, Yu-Chuan [1]
Chen, Shyi-Ming [1, 2]
Liau, Churn-Jung [3]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan
[2] Jinwen Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei, Taiwan
[3] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Keywords
text categorization; text classifiers; category-sensitive refinement method; multilabel text categorization; relevance scores;
DOI
10.1016/j.eswa.2007.02.037
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we present a new approach to multilabel text categorization based on a new linear classifier learning method and a category-sensitive refinement method. We use a new weighted indexing technique to construct a multilabel linear classifier, and we use the degrees of similarity between categories to adjust the relevance scores of categories with respect to a testing document. The testing document is then classified into multiple categories by using a predefined threshold value. We also compare the performance of the proposed method with that of existing text categorization methods on the Reuters-21578 ModApte Split Text Collection. The experimental results show that the proposed method performs better than the existing methods. (c) 2007 Elsevier Ltd. All rights reserved.
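The abstract names three steps: scoring with a linear classifier, refining the relevance scores using similarities between categories, and assigning labels with a threshold. The sketch below illustrates that general pipeline only. The abstract gives no formulas, so everything specific here is an assumption made for illustration and not the authors' method: the dot-product scoring, the cosine similarity between category weight vectors, the mixing parameter `alpha`, the averaging rule in `refine`, and all function names and toy weights.

```python
import math


def dot(u, v):
    """Dot product of two equal-length term-weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def relevance_scores(doc, category_weights):
    """Linear classifier: one dot-product score per category (assumed form)."""
    return {c: dot(doc, w) for c, w in category_weights.items()}


def refine(scores, category_weights, alpha=0.5):
    """Hypothetical refinement: add a share of the similarity-weighted
    scores of the other categories to each category's own score."""
    refined = {}
    for c, s in scores.items():
        others = [
            cosine(category_weights[c], category_weights[o]) * scores[o]
            for o in scores
            if o != c
        ]
        boost = sum(others) / len(others) if others else 0.0
        refined[c] = s + alpha * boost
    return refined


def classify(doc, category_weights, threshold=0.6, alpha=0.5):
    """Return every category whose refined score reaches the threshold."""
    scores = relevance_scores(doc, category_weights)
    refined = refine(scores, category_weights, alpha)
    return sorted(c for c, s in refined.items() if s >= threshold)


# Toy data (invented): term order is [wheat, grain, oil].
weights = {
    "wheat": [1.0, 0.5, 0.0],
    "grain": [0.5, 1.0, 0.0],
    "crude": [0.0, 0.0, 1.0],
}
doc = [0.6, 0.2, 0.0]
labels = classify(doc, weights)
```

With these toy numbers, the raw score for "grain" (0.5) falls below the threshold of 0.6, but its similarity to the higher-scoring "wheat" category lifts the refined score above it, so the document receives both labels. Setting `alpha=0` disables the refinement and leaves only "wheat".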
Pages: 1948-1953
Page count: 6