Metric Learning from Imbalanced Data with Generalization Guarantees

Cited by: 25
Authors
Gautheron, Leo [1 ]
Habrard, Amaury [1 ]
Morvant, Emilie [1 ]
Sebban, Marc [1 ]
Affiliations
[1] Univ Lyon, Inst Opt, Lab Hubert Curien, UJM St Etienne, CNRS, Grad Sch, UMR 5516, F-42023 St Etienne, France
Keywords
Imbalanced Data; Classification; Metric Learning; Statistical Machine Learning; Uniform Stability; SMOTE
DOI
10.1016/j.patrec.2020.03.008
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Since many machine learning algorithms require a distance metric to capture dis/similarities between data points, metric learning has received much attention during the past decade. Surprisingly, very few methods have focused on learning a metric in an imbalanced scenario, where the number of positive examples is much smaller than the number of negatives, and even fewer have derived theoretical guarantees in this setting. Here, we address this difficult task and design a new Mahalanobis metric learning algorithm (IML) that deals with class imbalance. We further prove a generalization bound involving the proportion of positive examples, using the uniform stability framework. An empirical study performed on a wide range of datasets shows the efficiency of IML. (C) 2020 Elsevier B.V. All rights reserved.
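The abstract describes IML only at a high level. As a purely illustrative sketch of the general idea it alludes to (a Mahalanobis metric parameterized as M = L^T L, fit on labeled pairs with inverse-frequency weights so the minority positive class is not drowned out by the negatives), the following minimal Python snippet may help. The function name, the hinge-style pair loss, and all hyperparameters are assumptions for illustration; this is not the authors' IML algorithm, and a real implementation would add regularization and a proper solver.

import numpy as np

def learn_mahalanobis(X, y, margin=1.0, lr=1e-4, epochs=50, seed=0):
    """Hypothetical sketch (not the paper's IML): learn a PSD matrix
    M = L.T @ L so that same-class pairs are pulled together and
    different-class pairs are pushed at least `margin` apart, with
    pairs involving the rare positive class upweighted."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    L = np.eye(d)  # start from the Euclidean metric
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    pos_rate = float(np.mean(y == 1))
    # inverse-frequency weights: an assumed imbalance correction
    w_pos = 1.0 / max(pos_rate, 1e-12)
    w_neg = 1.0 / max(1.0 - pos_rate, 1e-12)
    for _ in range(epochs):
        for idx in rng.permutation(len(pairs)):
            i, j = pairs[idx]
            diff = X[i] - X[j]
            z = L @ diff              # projected difference
            dist2 = float(z @ z)      # squared Mahalanobis distance
            w = w_pos if (y[i] == 1 or y[j] == 1) else w_neg
            if y[i] == y[j]:
                # similar pair: gradient of dist2 w.r.t. L is 2 z diff^T
                L -= lr * w * 2.0 * np.outer(z, diff)
            elif dist2 < margin:
                # dissimilar pair inside the margin: push it apart
                L += lr * w * 2.0 * np.outer(z, diff)
    return L.T @ L  # PSD by construction

# toy imbalanced usage: 95 negatives, 5 positives
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (95, 2)), rng.normal(2.0, 1.0, (5, 2))])
y = np.array([0] * 95 + [1] * 5)
M = learn_mahalanobis(X, y)
d01 = np.sqrt((X[0] - X[99]) @ M @ (X[0] - X[99]))  # learned distance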
Pages: 298-304
Number of pages: 7