Deep hybrid manifold for image set classification

Cited by: 0

Authors:

Zeng, Xianhua [1,2]

Guo, Jueqiu [1]

Wei, Yifan [1]

Zhuo, Yang [1]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Sch Comp Sci & Technol, Sch Artificial Intelligence, Chongqing 400065, Peoples R China

[2] U Posts & Telecommun, Sch Comp Sci & Technol, Sch Artificial Intelligence, Chongqing 400065, Peoples R China

Source:

IMAGE AND VISION COMPUTING | 2024 / Volume 143

Funding:

National Natural Science Foundation of China;

Keywords:

SPD manifold; Grassmann manifold; Visual classification; Hybrid manifold; Neural network; GEOMETRY;

DOI:

10.1016/j.imavis.2024.104935

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The exponential growth of the data volume of image sets, which contain more information than a single image, has attracted increasing attention from researchers. Image set data are often described as covariance matrices or linear subspaces, and the distinctive geometries they span are the symmetric positive definite (SPD) manifold and the Grassmann manifold, respectively. However, most studies focus on a single manifold and ignore the useful information of the other manifold. Based on this, we propose a new Deep Hybrid Manifold Network (DHMNet). The DHMNet consists of a backbone network, stackable Hybrid Manifold AutoEncoders (HMAEs), and a Maximum Fusion Module (MFM). The image set data are modeled through the SPD manifold and the Grassmann manifold. The modeled data are input into the backbone network, composed of SPDNet and GrNet, for initial feature extraction, and the output manifold data are input into the HMAEs. The HMAE effectively extracts and hybridizes complementary information from the different manifolds and can generate deep representations with rich structural semantic information. On the three image datasets used, DHMNet with two HMAEs improves classification accuracy by 3.83-5.76% over the classical SPDNet and is the best of the models compared, with its strongest performance on the First Person Hand Action (FPHA) dataset for skeleton-based hand action recognition.
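The two image-set descriptors named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names, the toy random data, the ridge term `eps`, and the subspace dimension `q` are all assumptions made for illustration. It only shows one common way to build a covariance matrix (a point on the SPD manifold) and an orthonormal subspace basis (a point on the Grassmann manifold) from the same set of feature vectors.

```python
import numpy as np

def spd_descriptor(X, eps=1e-3):
    """Covariance descriptor of an image set X (n_images x d features).

    The ridge eps * I is an assumed regularizer that keeps the result
    strictly positive definite even when n_images is small.
    """
    Xc = X - X.mean(axis=0, keepdims=True)   # center the features
    C = Xc.T @ Xc / (X.shape[0] - 1)         # d x d sample covariance
    return C + eps * np.eye(X.shape[1])

def grassmann_descriptor(X, q=3):
    """Orthonormal basis (d x q) of the dominant q-dimensional subspace."""
    # Left singular vectors of the d x n data matrix span the image set.
    U, _, _ = np.linalg.svd(X.T, full_matrices=False)
    return U[:, :q]

# Toy image set: 20 images, each reduced to an 8-dimensional feature vector.
rng = np.random.default_rng(0)
image_set = rng.normal(size=(20, 8))

C = spd_descriptor(image_set)            # 8 x 8 SPD matrix
U = grassmann_descriptor(image_set, 3)   # 8 x 3 orthonormal basis
```

In a hybrid setting such as the one the abstract describes, both descriptors would be computed for every image set and passed to manifold-specific networks. How DHMNet actually combines them is defined in the paper, not in this sketch.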

Pages: 15
