Graph-Based Approaches for Over-Sampling in the Context of Ordinal Regression

Cited by: 46
Authors
Perez-Ortiz, Maria [1]
Antonio Gutierrez, Pedro [1]
Hervas-Martinez, Cesar [1]
Yao, Xin [2]
Affiliations
[1] Univ Cordoba, Dept Comp Sci & Numer Anal, E-14004 Cordoba, Spain
[2] Univ Birmingham, CERCIA, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Over-sampling; imbalanced classification; ordinal regression; ordinal classification;
DOI
10.1109/TKDE.2014.2365780
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The classification of patterns into naturally ordered labels is referred to as ordinal regression or ordinal classification. This setting is often highly imbalanced by nature, because some classes are a priori more probable than others. Although standard over-sampling methods can improve the classification of minority classes in ordinal classification, they tend to introduce severe errors in terms of the ordinal label scale, given that they do not take the ordering into account. A specific ordinal over-sampling method is developed in this paper for the first time in order to improve the performance of machine learning classifiers. The proposed method incorporates ordinal information by approaching over-sampling from a graph-based perspective. The results presented in this paper show the good synergy of a popular ordinal regression method (a reformulation of support vector machines) with the proposed graph-based algorithms, and the possibility of improving both the classification and the ordering of minority classes. A cost-sensitive version of the ordinal regression method is also introduced and compared with the over-sampling proposals, with the cost-sensitive version generally showing lower performance for minority classes.
Pages: 1233-1245
Number of pages: 13