Distance Metric Learning with Prototype Selection for Imbalanced Classification

Cited by: 2
Authors
Luis Suarez, Juan [1]
Garcia, Salvador [1]
Herrera, Francisco [1]
Affiliation
[1] Univ Granada, Andalusian Res Inst Data Sci & Computat Intellige, Dept Comp Sci & Artificial Intelligence, DaSCI, Granada 18071, Spain
Source
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2021 | 2021, Vol. 12886
Keywords
Distance metric learning; Imbalanced classification; Nearest neighbors; Neighborhood components analysis; Undersampling; NEIGHBOR; SMOTE;
DOI
10.1007/978-3-030-86271-8_33
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Distance metric learning is a discipline that has recently become popular thanks to its ability to significantly improve similarity-based learning methods, such as the nearest neighbors classifier. Most proposals in this area focus on standard supervised and weakly supervised learning problems. In this paper, we propose a distance metric learning method that handles imbalanced classification via prototype selection. Our method, which we call condensed neighborhood components analysis (CNCA), improves the classic neighborhood components analysis by incorporating the foundations of the condensed nearest neighbors undersampling method. We show how to implement this algorithm and provide a Python implementation. We have also evaluated its performance on imbalanced classification problems, obtaining very good results under several imbalance-oriented evaluation metrics.
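The two ingredients the abstract names can be illustrated with public scikit-learn building blocks. The sketch below is an assumption-laden approximation, not the authors' CNCA algorithm or their released implementation: it first runs Hart's condensed nearest neighbors rule to select prototypes, then fits standard neighborhood components analysis on the condensed set. The `condense` helper is a hypothetical name introduced here; only the scikit-learn classes are real APIs.

```python
# Minimal sketch (NOT the paper's CNCA): chain CNN prototype selection
# with scikit-learn's NeighborhoodComponentsAnalysis.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis


def condense(X, y, seed=0):
    """Hart's condensed nearest neighbors: keep a prototype subset that
    still classifies every training point correctly with a 1-NN rule."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    keep = [order[0]]
    changed = True
    while changed:
        changed = False
        nn = KNeighborsClassifier(n_neighbors=1).fit(X[keep], y[keep])
        for i in order:
            if nn.predict(X[i : i + 1])[0] != y[i]:
                keep.append(i)  # misclassified point becomes a prototype
                nn = KNeighborsClassifier(n_neighbors=1).fit(X[keep], y[keep])
                changed = True
    return np.asarray(keep)


# Imbalanced toy problem: roughly 90% majority / 10% minority class.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)
proto = condense(X, y)

# Learn a linear transformation on the condensed prototypes, then
# classify with k-NN in the learned space.
nca = NeighborhoodComponentsAnalysis(random_state=0).fit(X[proto], y[proto])
knn = KNeighborsClassifier(n_neighbors=3).fit(nca.transform(X[proto]), y[proto])
print("prototypes kept:", len(proto), "of", len(X))
```

Condensing before (rather than after) metric learning keeps the NCA optimization small and biases the prototype set toward boundary points, which is where the minority class is typically misclassified; the paper's actual CNCA integrates these ideas more tightly than this two-stage pipeline.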
Pages: 391-402
Page count: 12
References
26 in total
[1] Benavoli A, 2014, PR MACH LEARN RES, V32, P1026
[2] Benavoli A, 2017, J MACH LEARN RES, V18
[3] Branco, Paula; Torgo, Luis; Ribeiro, Rita P. A Survey of Predictive Modeling on Imbalanced Domains [J]. ACM COMPUTING SURVEYS, 2016, 49(02)
[4] Carrasco, Jacinto; Garcia, Salvador; del Mar Rueda, Maria; Herrera, Francisco. rNPBST: An R Package Covering Non-parametric and Bayesian Statistical Tests [J]. HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2017, 2017, 10334: 281-292
[5] Chang F, 2006, J MACH LEARN RES, V7, P2125
[6] Chawla, Nitesh V.; Bowyer, Kevin W.; Hall, Lawrence O.; Kegelmeyer, W. Philip. SMOTE: Synthetic minority over-sampling technique [J]. 2002, American Association for Artificial Intelligence (16)
[7] Cover, T. M.; Hart, P. E. Nearest Neighbor Pattern Classification [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1967, 13(01): 21+
[8] Cunningham JP, 2015, J MACH LEARN RES, V16, P2859
[9] Devi, V. S.; Murty, M. N. An incremental prototype set building technique [J]. PATTERN RECOGNITION, 2002, 35(02): 505-513
[10] Fernandez, Alberto; Garcia, Salvador; Herrera, Francisco; Chawla, Nitesh V. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2018, 61: 863-905