Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition

Cited by: 17
Authors
Wang, Yancheng [1]
Xiao, Yang [1]
Lu, Junyi [1]
Tan, Bo [1]
Cao, Zhiguo [1]
Zhang, Zhenjun [2]
Zhou, Joey Tianyi [3]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] Hunan Univ, Coll Elect & Informat Engn, Natl Engn Lab Robot Visual Percept & Control Tech, Changsha 410082, Peoples R China
[3] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Funding
National Natural Science Foundation of China
Keywords
Visualization; Feature extraction; Encoding; Skeleton; Task analysis; Image recognition; Image coding; Cross-view 3-D action recognition; discriminative viewpoint instance discovery; Fisher vector (FV); multi-view dynamic image (MVDI); viewpoint aggregation; JOINTS;
DOI
10.1109/TNNLS.2021.3070179
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Dramatic variation in imaging viewpoint is a critical challenge for action recognition in depth video. One feasible way to address it is to enhance the view tolerance of the visual features while maintaining strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. After being encoded using Fisher vector (FV), they are aggregated by sum-pooling to yield a representative 3-D action signature. Our insight is that viewpoint aggregation helps to enhance view tolerance, while FV maps the raw DI features to a higher dimensional feature space to promote discriminative power. In addition, a discriminative viewpoint instance discovery method is proposed to discard the viewpoint instances unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed approach significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
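The aggregation step described in the abstract (encode each viewpoint instance, optionally discard unfavorable instances, then sum-pool) can be illustrated with a minimal sketch. This is not the authors' implementation, which is at the repository linked above. The function names `sum_pool_views` and `toy_encode`, the boolean `keep` mask, and the final L2 normalization are all assumptions made for illustration, and `toy_encode` is only a hypothetical stand-in for a real Fisher vector encoder.

```python
import math
from typing import Callable, List, Optional, Sequence


def toy_encode(feature: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for an FV encoder (NOT a real Fisher vector).

    It lifts a d-dimensional feature to 2d dimensions (values and squares),
    only to mimic the idea of mapping to a higher dimensional space.
    """
    return [*feature, *(x * x for x in feature)]


def sum_pool_views(
    view_features: Sequence[Sequence[float]],
    encode: Callable[[Sequence[float]], List[float]],
    keep: Optional[Sequence[bool]] = None,
) -> List[float]:
    """Encode each viewpoint instance and sum-pool them into one signature.

    `keep` is an assumed boolean mask marking which viewpoint instances
    survive a (separately computed) instance-selection step.
    """
    if keep is None:
        keep = [True] * len(view_features)
    if len(keep) != len(view_features):
        raise ValueError("keep mask must have one entry per viewpoint")

    encoded = [encode(f) for f, k in zip(view_features, keep) if k]
    if not encoded:
        raise ValueError("no viewpoint instances left to aggregate")

    # Sum-pooling: element-wise sum over the retained viewpoint encodings.
    pooled = [sum(column) for column in zip(*encoded)]

    # Assumption: L2-normalize the pooled signature (not stated in the abstract).
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


if __name__ == "__main__":
    views = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    # Discard the third viewpoint instance, then aggregate the rest.
    print(sum_pool_views(views, toy_encode, keep=[True, True, False]))
```

Because the sum does not depend on the order of the viewpoints, the resulting signature is unchanged when the retained views are permuted, which is one intuition for why aggregation may help view tolerance.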
Pages: 5332-5345
Page count: 14