Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 17

Authors:

Wang, Yancheng [1]

Xiao, Yang [1]

Lu, Junyi [1]

Tan, Bo [1]

Cao, Zhiguo [1]

Zhang, Zhenjun [2]

Zhou, Joey Tianyi [3]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China

[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China

[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2022 / Vol. 33 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;

DOI:

10.1109/TNNLS.2021.3070179

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Dramatic variation in imaging viewpoint is a critical challenge for action recognition in depth video. One feasible way to address it is to enhance the view tolerance of the visual features while maintaining strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. After being encoded using the Fisher vector (FV), they are aggregated by sum-pooling to yield a representative 3-D action signature. Our insight is that viewpoint aggregation helps to enhance view tolerance, and that FV maps the raw DI features to a higher dimensional feature space, which promotes discriminative power. Meanwhile, a discriminative viewpoint instance discovery method is also proposed to discard the viewpoint instances that are unfavorable for action characterization. Extensive experiments on five data sets demonstrate that our approach significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
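The encode-then-pool step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes each viewpoint instance is already reduced to a small feature vector, uses a hand-set two-component diagonal GMM rather than a fitted one, and keeps only the mean-gradient part of the Fisher vector, without power normalization. All function names, parameter values, and feature values are hypothetical.

```python
import math

def gmm_posteriors(x, weights, means, sigmas):
    """Soft-assign descriptor x to each diagonal-covariance Gaussian."""
    log_p = []
    for w, mu, sd in zip(weights, means, sigmas):
        ll = math.log(w)
        for xi, mi, si in zip(x, mu, sd):
            ll += -0.5 * ((xi - mi) / si) ** 2 - math.log(si) - 0.5 * math.log(2 * math.pi)
        log_p.append(ll)
    top = max(log_p)  # subtract the max for numerical stability
    exp_p = [math.exp(v - top) for v in log_p]
    total = sum(exp_p)
    return [v / total for v in exp_p]

def fisher_encode(x, weights, means, sigmas):
    """Simplified Fisher vector of one descriptor: gradients w.r.t. the GMM means only."""
    gamma = gmm_posteriors(x, weights, means, sigmas)
    fv = []
    for g, w, mu, sd in zip(gamma, weights, means, sigmas):
        fv.extend(g * (xi - mi) / (si * math.sqrt(w)) for xi, mi, si in zip(x, mu, sd))
    return fv  # length = number of Gaussians * descriptor dimension

def aggregate_views(view_features, weights, means, sigmas):
    """Encode every viewpoint instance, sum-pool the codes, then L2-normalize."""
    encoded = [fisher_encode(x, weights, means, sigmas) for x in view_features]
    pooled = [sum(col) for col in zip(*encoded)]
    norm = math.sqrt(sum(v * v for v in pooled)) or 1.0
    return [v / norm for v in pooled]

# Toy GMM (assumed values, not learned from data).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [1.0, 1.0]]
sigmas = [[1.0, 1.0], [1.0, 1.0]]

# Hypothetical 2-D features for three viewpoint instances of one action.
views = [[0.2, 0.1], [0.9, 1.2], [0.5, 0.4]]

signature = aggregate_views(views, weights, means, sigmas)
signature_reordered = aggregate_views(list(reversed(views)), weights, means, sigmas)
```

Because sum-pooling ignores the order of the instances, `signature` and `signature_reordered` agree up to floating-point rounding. This is one simple sense in which aggregating over viewpoints makes the signature less dependent on any single view.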

Pages: 5332 - 5345

Number of pages: 14

50 items in total

[1] Cross-View Fusion for Multi-View Clustering
Huang, Zhijie
Huang, Binqiang
Zheng, Qinghai
Yu, Yuanlong
IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 621 - 625
[2] Multi-View Gait Image Generation for Cross-View Gait Recognition
Chen, Xin
Luo, Xizhao
Weng, Jian
Luo, Weiqi
Li, Huiting
Tian, Qi
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3041 - 3055
[3] Mining Discriminative 3D Poselet for Cross-view Action Recognition
Wang, Jiang
Nie, Xiaohan
Xia, Yin
Wu, Ying
2014 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2014, : 634 - 639
[4] Discriminative Virtual Views for Cross-View Action Recognition
Li, Ruonan
Zickler, Todd
2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2855 - 2862
[5] Bidirectional Fusion With Cross-View Graph Filter for Multi-View Clustering
Yang, Xiaojun
Zhu, Tuoji
Wu, Danyang
Wang, Penglei
Liu, Yujia
Nie, Feiping
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (11) : 5675 - 5680
[6] Cross-view Transformer for enhanced multi-view 3D reconstruction
Shi, Wuzhen
Yin, Aixue
Li, Yingxiang
Qian, Bo
VISUAL COMPUTER, 2024,
[7] 3-D Dynamic Multitarget Detection Algorithm Based on Cross-View Feature Fusion
Zhou F.
Tao C.
Gao Z.
Zhang Z.
Zheng S.
Zhu Y.
IEEE Transactions on Artificial Intelligence, 2024, 5 (06) : 3146 - 3159
[8] Pairwise-Covariance Multi-view Discriminant Analysis for Robust Cross-View Human Action Recognition
Tran, Hoang-Nhat
Nguyen, Hong-Quan
Doan, Huong-Giang
Tran, Thanh-Hai
Le, Thi-Lan
Vu, Hai
IEEE ACCESS, 2021, 9 : 76097 - 76111
[9] Multi-view Deep Network for Cross-view Classification
Kan, Meina
Shan, Shiguang
Chen, Xilin
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 4847 - 4855
[10] Multi-View Latent Variable Discriminative Models For Action Recognition
Song, Yale
Morency, Louis-Philippe
Davis, Randall
2012 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2012, : 2120 - 2127
