A Comprehensive Look at Coding Techniques on Riemannian Manifolds

Cited by: 10

Authors:

Faraki, Masoud [1,2,3]

Harandi, Mehrtash T. [1,2]

Porikli, Fatih [1]

Affiliations:

[1] Australian Natl Univ, Res Sch Engn, Canberra, ACT 0200, Australia

[2] CSIRO, Data61, Canberra, ACT 2601, Australia

[3] Monash Univ, Australian Ctr Robot Vis, Melbourne, Vic 3800, Australia

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2018 / Vol. 29 / No. 11

Keywords:

Bag of words (BoW); collaborative coding (CC); locality-constrained linear coding (LLC); Riemannian geometry; sparse coding (SC); vector of locally aggregated descriptors (VLADs); CLASSIFICATION; RECOGNITION;

DOI:

10.1109/TNNLS.2018.2812799

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Visual recognition, such as image and video classification, is core to many learning pipelines. In such applications, a compact yet rich and informative representation plays a pivotal role. Traditional coding schemes [e.g., sparse coding (SC)] assume that the data comply geometrically with Euclidean space; in other words, the data are presented to the algorithm in vector form and the Euclidean axioms are fulfilled. As a large number of recent studies show, this assumption is restrictive in machine learning, computer vision, and signal processing. This paper takes a further step and provides a comprehensive mathematical framework for coding in curved, non-Euclidean spaces, i.e., Riemannian manifolds. To this end, we start with the simplest form of coding, namely, bag of words (BoW). Then, inspired by the success of the vector of locally aggregated descriptors (VLAD) in addressing computer vision problems, we introduce its Riemannian extensions. Finally, we study the Riemannian forms of SC, locality-constrained linear coding (LLC), and collaborative coding (CC). Through rigorous tests, we demonstrate the superior performance of our Riemannian coding schemes against state-of-the-art methods on several visual classification tasks, including head pose classification, video-based face recognition, and dynamic scene recognition.
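As a rough illustration of the simplest scheme named in the abstract, the sketch below builds a hard-assignment bag-of-words histogram over symmetric positive definite (SPD) matrices. It is not the authors' implementation: the choice of the SPD manifold, the log-Euclidean distance, the random codebook, and every function name (`spd_log`, `log_euclidean_dist`, `riemannian_bow`, `random_spd`) are assumptions made here for illustration only.

```python
import numpy as np

def spd_log(S):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    # V @ diag(log w) @ V.T, written with broadcasting over columns.
    return (V * np.log(w)) @ V.T

def log_euclidean_dist(A, B):
    """Log-Euclidean distance between SPD matrices (assumed metric)."""
    return np.linalg.norm(spd_log(A) - spd_log(B), ord="fro")

def riemannian_bow(descriptors, codebook):
    """Each descriptor votes for its nearest codeword; returns a normalized histogram."""
    hist = np.zeros(len(codebook))
    for X in descriptors:
        dists = [log_euclidean_dist(X, C) for C in codebook]
        hist[int(np.argmin(dists))] += 1
    return hist / max(hist.sum(), 1.0)

def random_spd(rng, d):
    """Hypothetical helper: draw a well-conditioned random SPD matrix."""
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)

rng = np.random.default_rng(0)
# Assumption: a random codebook stands in for one learned by clustering.
codebook = [random_spd(rng, 3) for _ in range(4)]
descriptors = [random_spd(rng, 3) for _ in range(20)]
hist = riemannian_bow(descriptors, codebook)
```

The only change from Euclidean bag of words is the distance used for nearest-codeword assignment; in practice the codebook would be learned on the manifold rather than drawn at random.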


Pages: 5701-5712

Page count: 12

