<bold>USING MULTIPLE SETS OF ATTRIBUTES FOR TEXT CATEGORIZATION</bold>

Cited by: 0
Authors
Bi, Ya-Xin [1 ]
Zhang, Qiang [2 ]
Wu, Sheng-Li [1 ]
Guan, Ji-Wen [3 ]
Affiliations
[1] Univ Ulster, Sch Comp & Math, Newtownabbey BT37 0QB, Antrim, North Ireland
[2] Baicheng Normal Coll, Dept Comp, Baicheng 137000, Peoples R China
[3] Queens Univ Belfast, Sch Comp Sci, Belfast, Antrim BT71NN, North Ireland
Source
PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2006
Keywords
inductive learning; ensemble methods; information fusion; text categorization;
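DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code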
0812 ;
Abstract
This paper investigates how multiple sets of attributes can be generated with a rough set-based inductive learning method and how they can be combined, using Dempster's rule of combination, to improve classification decisions, particularly in text categorization. We first propose a boosting-like technique, based on rough set theory, for generating multiple sets of attributes, together with a method for transforming those sets of attributes into multiple sets of rules. We then model the classification decisions inferred by the rules as pieces of evidence. Experiments were carried out on 10 of the 20 newsgroups in the 20-newsgroups benchmark collection, individually and in combination. The experimental results support the claim that "decisions made by multiple experts would be more effective than any one if their individual judgments are appropriately combined".
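As a rough illustration of the combination step named in the abstract, the following is a minimal sketch of Dempster's rule of combination for two mass functions. It is not the authors' implementation: the function name `dempster_combine`, the class labels, and the mass values are hypothetical and chosen only to show how two rule-based decisions, treated as pieces of evidence, could be fused.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Contradictory evidence: accumulate the conflicting mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the evidence cannot be combined")
    # Normalize by the non-conflicting mass so the result sums to 1.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


# Hypothetical example: two rule sets assess one document over three labels.
frame = frozenset({"comp.graphics", "rec.autos", "sci.med"})
m_rules_1 = {frozenset({"comp.graphics"}): 0.6, frame: 0.4}
m_rules_2 = {
    frozenset({"comp.graphics"}): 0.5,
    frozenset({"sci.med"}): 0.3,
    frame: 0.2,
}

fused = dempster_combine(m_rules_1, m_rules_2)
# Pick the single label with the highest combined mass.
best = max((f for f in fused if len(f) == 1), key=fused.get)
```

In this sketch, mass assigned to the whole frame stands for the part of a rule set's belief that is not committed to any single category, so a rule set that is unsure does not veto the others.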
Pages: 2252 / +
Number of pages: 2