A weighted rough set method to address the class imbalance problem

Cited by: 0
Authors
Liu, Jin-Fu [1]
Yu, Da-Ren [1]
Affiliation
[1] Harbin Inst Technol, Harbin 150001, Peoples R China
Source
PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7 | 2007
Keywords
rough sets; class imbalance learning; instance weighting; weighted entropy; rule extraction
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The class imbalance problem has recently been reported to hinder the performance of learning systems. Most traditional learning algorithms assume well-balanced datasets, so they are biased towards the majority class and may predict minority class examples poorly. In this paper, we develop weighted rough sets (WRS) to deal with this problem. In weighted rough sets, weighted entropy is introduced and extended to compute the information content contributed by attributes. A forward greedy weighted attribute reduction algorithm based on the weighted entropy and a weighted rule extraction algorithm are provided. The factors of weighted strength, weighted certainty and weighted cover are employed to evaluate the extracted rules. Finally, a decision algorithm based on the weighted strength factor is constructed. Based on weighted rough sets, a series of experiments on class imbalance learning is conducted on 20 UCI data sets. In terms of AUC and minority class accuracy, WRS achieves better results than classical rough sets in class imbalance learning. Moreover, the evaluation of extracted rules has a greater influence than the selection of attributes on weighted rough set learning.
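The abstract does not give the formulas, so the following is only a minimal sketch of how a forward greedy attribute reduction driven by a weighted entropy might look. It assumes that the probability of a set of instances is its share of the total instance weight, and that minority class instances simply receive larger weights; the paper's actual definitions may differ. The function names `weighted_conditional_entropy` and `greedy_weighted_reduct`, the toy table and the weight values are all hypothetical.

```python
import math
from collections import defaultdict


def weighted_conditional_entropy(rows, weights, attrs, decision):
    """H_w(decision | attrs). ASSUMPTION: probabilities are shares of total instance weight."""
    total = sum(weights)
    block_w = defaultdict(float)  # weight of each equivalence class induced by attrs
    joint_w = defaultdict(float)  # weight of each (equivalence class, decision value) pair
    for row, w in zip(rows, weights):
        key = tuple(row[a] for a in attrs)
        block_w[key] += w
        joint_w[(key, row[decision])] += w
    h = 0.0
    for (key, _), w in joint_w.items():
        if w > 0:
            h -= (w / total) * math.log2(w / block_w[key])
    return h


def greedy_weighted_reduct(rows, weights, cond_attrs, decision, eps=1e-12):
    """Forward greedy search: repeatedly add the attribute that lowers H_w the most."""
    target = weighted_conditional_entropy(rows, weights, cond_attrs, decision)
    selected = []
    current = weighted_conditional_entropy(rows, weights, selected, decision)
    remaining = list(cond_attrs)
    # Stop once the selected subset is as informative as the full attribute set.
    while remaining and current - target > eps:
        best = min(
            remaining,
            key=lambda a: weighted_conditional_entropy(
                rows, weights, selected + [a], decision
            ),
        )
        selected.append(best)
        remaining.remove(best)
        current = weighted_conditional_entropy(rows, weights, selected, decision)
    return selected


# Hypothetical toy decision table: one minority ("pos") instance, up-weighted to 3.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "neg"},
    {"a": 0, "b": 1, "c": 0, "d": "neg"},
    {"a": 1, "b": 0, "c": 0, "d": "neg"},
    {"a": 1, "b": 1, "c": 1, "d": "pos"},
]
weights = [1.0, 1.0, 1.0, 3.0]
print(greedy_weighted_reduct(rows, weights, ["a", "b", "c"], "d"))  # prints ['c']
```

With these weights the two classes carry equal total weight, so the minority instance influences the entropy as much as the three majority instances combined. Setting all weights to 1.0 reduces the sketch to an ordinary entropy-based reduction.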
Pages: 3693-3698
Page count: 6