Scalable Large-Margin Mahalanobis Distance Metric Learning

Cited by: 56
Authors
Shen, Chunhua [1 ,2 ]
Kim, Junae [2 ]
Wang, Lei [2 ]
Affiliations
[1] NICTA, Canberra Res Lab, Canberra, ACT 2601, Australia
[2] Australian Natl Univ, Canberra, ACT 0200, Australia
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2010, Vol. 21, No. 9
Keywords
Distance metric learning; large-margin nearest neighbor; Mahalanobis distance; semidefinite optimization; MATRIX;
DOI
10.1109/TNN.2010.2052630
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The success of many machine learning algorithms, such as k-nearest neighbor (k-NN) classifiers and k-means clustering, often depends heavily on the metric used to calculate distances between data points. An effective way to define such a metric is to learn it from a set of labeled training samples. In this work, we propose a fast and scalable algorithm for learning a Mahalanobis distance metric. The Mahalanobis metric can be viewed as the Euclidean distance metric on input data that have been linearly transformed. By employing the principle of margin maximization to achieve better generalization performance, the algorithm formulates metric learning as a convex optimization problem whose unknown variable is a positive semidefinite (p.s.d.) matrix. Based on the theorem that a trace-one p.s.d. matrix can always be represented as a convex combination of rank-one matrices, our algorithm accommodates any differentiable loss function and solves the resulting optimization problem with a specialized gradient descent procedure. Throughout the optimization, the algorithm maintains the positive semidefiniteness of the matrix variable, which is essential for a Mahalanobis metric. Compared with conventional methods such as standard interior-point algorithms [2] or the special-purpose solver used in large margin nearest neighbor [24], our algorithm is much more efficient and scales much better. Experiments on benchmark data sets suggest that, compared with state-of-the-art metric learning algorithms, our algorithm achieves comparable classification accuracy at reduced computational cost.
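The trace-one decomposition highlighted in the abstract admits a particularly simple optimization scheme: since any convex combination of trace-one p.s.d. matrices is itself trace-one and p.s.d., the matrix variable can be pulled toward a rank-one matrix v v^T at every iteration without ever projecting back onto the feasible set. The Python sketch below illustrates this idea as a generic Frank-Wolfe-style update; it is not the paper's exact solver, and `grad_fn` is a hypothetical placeholder for the gradient of whatever differentiable loss is chosen.

```python
# Minimal sketch of rank-one gradient descent over the set
# {X : X p.s.d., trace(X) = 1}. Assumption: this is a generic
# Frank-Wolfe-style illustration, not the paper's exact procedure.
import numpy as np

def rank_one_metric_descent(grad_fn, dim, num_iters=200):
    """Minimize a differentiable loss over trace-one p.s.d. matrices.

    grad_fn(X) must return the (symmetric) gradient of the loss at X.
    """
    X = np.eye(dim) / dim                 # feasible start: p.s.d., trace one
    for t in range(num_iters):
        G = grad_fn(X)
        # The best rank-one direction v v^T uses the eigenvector of G
        # with the smallest eigenvalue (eigh sorts eigenvalues ascending).
        _, eigvecs = np.linalg.eigh(G)
        v = eigvecs[:, 0]                 # unit norm, so trace(v v^T) = 1
        alpha = 2.0 / (t + 2.0)           # standard Frank-Wolfe step size
        # A convex combination of two trace-one p.s.d. matrices stays
        # trace-one and p.s.d., so positive semidefiniteness is
        # maintained for free, with no projection step.
        X = (1.0 - alpha) * X + alpha * np.outer(v, v)
    return X

# Hypothetical usage: pull X toward a target metric T via a squared
# Frobenius loss, whose gradient at X is 2 * (X - T).
if __name__ == "__main__":
    dim = 5
    T = np.eye(dim) / dim
    X_learned = rank_one_metric_descent(lambda X: 2.0 * (X - T), dim)
    print(np.linalg.eigvalsh(X_learned).min() >= -1e-9, np.trace(X_learned))
```

Each iterate produced this way is, by construction, a convex combination of rank-one matrices, which is exactly the representation guaranteed by the theorem the abstract cites.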
Pages: 1524-1530
Number of pages: 7
Related Papers
27 records in total
  • [11] Granas, A. (2007). Journal of Fixed Point Theory and Applications, 1, 1. DOI: 10.1007/s11784-006-0010-5
  • [12] Hsu, C. W., & Lin, C. J. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415-425.
  • [13] Kim, J. (2009). Lecture Notes in Computer Science, 5996, 299.
  • [14] Kulis, B. (2009). Journal of Machine Learning Research, 10, 341.
  • [15] Lanckriet, G. R. G. (2004). Journal of Machine Learning Research, 5, 27.
  • [16] Lowe, D. (1999). Proceedings of the 7th IEEE International Conference on Computer Vision, 2, 1150. DOI: 10.1109/ICCV.1999.790410
  • [17] Mikolajczyk, K., & Schmid, C. (2004). Scale & affine invariant interest point detectors. International Journal of Computer Vision, 60(1), 63-86.
  • [18] Rosales, R. (2006). Proceedings of the 12th ACM SIGKDD International Conference, 367. DOI: 10.1145/1150402.1150444
  • [19] Shen, C. (2008). Advances in Neural Information Processing Systems, 1473.
  • [20] Smeulders, A. W. M., Worring, M., Santini, S., Gupta, A., & Jain, R. (2000). Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), 1349-1380.