Geometric SMOTE for regression

Cited by: 45
Authors
Camacho, Luis [1 ]
Douzas, Georgios [1 ]
Bacao, Fernando [1 ]
Affiliations
[1] Univ Nova Lisboa, NOVA Informat Management Sch NOVA IMS, Campus Campolide, P-1070312 Lisbon, Portugal
Keywords
Imbalanced; Regression; Data-level; IMBALANCED DATA; CHALLENGES;
DOI
10.1016/j.eswa.2021.116387
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning from imbalanced data sets is known to be a challenging task. Many proposals tackle this challenge for classification problems, but few solutions exist for regression. In the context of regression, imbalanced learning means that the goal is to accurately predict target values in a subset of the continuous target variable's range, where those values rarely occur in the data set. In this article, we extend the G-SMOTE algorithm, which is used in classification, to regression tasks. G-SMOTE is a preprocessing algorithm that differs from SMOTE in that it generates synthetic instances in a geometric region around the selected instances rather than on the line segment that joins the two selected instances. The performance of G-SMOTE for regression was compared against other methods, and the empirical results show that our proposal outperformed those methods.
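The contrast the abstract draws between the two generation schemes can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the simplest geometric region (a full hypersphere centred on one selected instance, with radius equal to the distance to the other), and it omits any further shaping of that region. The target rule (a distance-weighted average of the two seed targets) is likewise an assumption made for illustration, and all function names are hypothetical.

```python
import numpy as np


def smote_point(x_a, x_b, rng):
    """Classic SMOTE: a random point on the segment joining x_a and x_b."""
    return x_a + rng.uniform(0.0, 1.0) * (x_b - x_a)


def geometric_point(x_a, x_b, rng):
    """Simplified geometric variant (assumption): a random point inside the
    hypersphere centred on x_a whose radius is the distance to x_b."""
    radius = np.linalg.norm(x_b - x_a)
    direction = rng.normal(size=x_a.shape)
    direction /= np.linalg.norm(direction)
    # Scaling by u**(1/d) spreads points uniformly over the ball's volume.
    scale = rng.uniform(0.0, 1.0) ** (1.0 / x_a.size)
    return x_a + radius * scale * direction


def interpolated_target(x_new, x_a, y_a, x_b, y_b):
    """Assumed target rule: average of the two seed targets, weighted so that
    the closer seed contributes more."""
    d_a = np.linalg.norm(x_new - x_a)
    d_b = np.linalg.norm(x_new - x_b)
    if d_a + d_b == 0.0:
        return 0.5 * (y_a + y_b)
    return (d_b * y_a + d_a * y_b) / (d_a + d_b)


rng = np.random.default_rng(0)
x_a, y_a = np.array([0.0, 0.0]), 10.0  # selected rare instance and its target
x_b, y_b = np.array([1.0, 0.0]), 20.0  # its selected neighbour

p_line = smote_point(x_a, x_b, rng)     # stays on the segment
p_geo = geometric_point(x_a, x_b, rng)  # may leave the segment
y_geo = interpolated_target(p_geo, x_a, y_a, x_b, y_b)
```

In this sketch the SMOTE point is confined to the segment between the two instances, while the geometric point can fall anywhere within the surrounding region, which is the difference the abstract describes.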
Pages: 8
Related papers
31 in total
[1]  
Alcalá-Fdez J, 2011, J MULT-VALUED LOG S, V17, P255
[2]  
[Anonymous], 2011, Utility-based Regression
[3]   Pre-processing approaches for imbalanced distributions in regression [J].
Branco, Paula ;
Torgo, Luis ;
Ribeiro, Rita P. .
NEUROCOMPUTING, 2019, 343 :76-99
[4]   A Survey of Predictive Modeling on Imbalanced Domains [J].
Branco, Paula ;
Torgo, Luis ;
Ribeiro, Rita P. .
ACM COMPUTING SURVEYS, 2016, 49 (02)
[5]  
Breiman L., 2001, Mach. Learn., V45, P5
[6]   DBSMOTE: Density-Based Synthetic Minority Over-sampling TEchnique [J].
Bunkhumpornpat, Chumphol ;
Sinapiromsaran, Krung ;
Lursinsap, Chidchanok .
APPLIED INTELLIGENCE, 2012, 36 (03) :664-684
[7]   SMOTE: Synthetic minority over-sampling technique [J].
Chawla, Nitesh V. ;
Bowyer, Kevin W. ;
Hall, Lawrence O. ;
Kegelmeyer, W. Philip .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 :321-357
[8]  
Demsar J, 2006, J MACH LEARN RES, V7, P1
[9]   Geometric SMOTE a geometrically enhanced drop-in replacement for SMOTE [J].
Douzas, Georgios ;
Bacao, Fernando .
INFORMATION SCIENCES, 2019, 501 :118-135
[10]   Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE [J].
Douzas, Georgios ;
Bacao, Fernando ;
Last, Felix .
INFORMATION SCIENCES, 2018, 465 :1-20