Cross-view action recognition understanding from exocentric to egocentric perspective

Cited by: 0
Authors
Truong, Thanh-Dat [1 ]
Luu, Khoa [1 ]
Affiliations
[1] Univ Arkansas, Comp Vis & Image Understanding Lab, Fayetteville, AR 72701 USA
Funding
U.S. National Science Foundation
Keywords
Cross-view action recognition; Self-attention; Egocentric action recognition; ATTENTION;
DOI
10.1016/j.neucom.2024.128731
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Understanding actions in egocentric videos has emerged as a vital research topic with numerous practical applications. Given the limited scale of egocentric data collection, learning robust action recognition models remains difficult. Transferring knowledge learned from large-scale exocentric data to egocentric data is challenging due to the differences between videos across views. This work introduces a novel cross-view learning approach to action recognition (CVAR) that effectively transfers knowledge from the exocentric to the egocentric view. First, we present a novel geometric-based constraint on the self-attention mechanism in Transformers, derived from analyzing the camera positions of the two views. Then, we propose a new cross-view self-attention loss, learned on unpaired cross-view data, that enforces the attention mechanism to transfer knowledge across views. Finally, to further improve the performance of our cross-view learning approach, we present metrics to effectively measure the correlations in videos and attention maps. Experimental results on standard egocentric action recognition benchmarks, i.e., Charades-Ego, EPIC-Kitchens-55, and EPIC-Kitchens-100, have shown our approach's effectiveness and state-of-the-art performance.
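The abstract's cross-view self-attention loss can be illustrated with a minimal sketch. The snippet below is not the authors' implementation (the paper's geometric constraint and correlation metrics are omitted); it only shows the general idea under one simplifying assumption: since exocentric and egocentric clips are unpaired, the loss compares view-level attention statistics (here, the mean attention distribution over query tokens, via a symmetric KL divergence) rather than matching individual frames. The function names and the choice of divergence are illustrative assumptions.

```python
# Illustrative sketch only, NOT the paper's implementation: a cross-view
# self-attention loss that encourages attention maps computed from unpaired
# exocentric and egocentric token features to share similar statistics.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_map(feats):
    # feats: (tokens, dim). Single-head scaled dot-product attention with
    # query/key projections omitted for brevity (an assumption, not the paper).
    scores = feats @ feats.T / np.sqrt(feats.shape[1])
    return softmax(scores, axis=-1)  # (tokens, tokens), rows sum to 1

def cross_view_attention_loss(exo_feats, ego_feats):
    # The clips are unpaired, so compare per-view marginal statistics:
    # the mean attention distribution over query tokens. Assumes both
    # views are tokenized to the same number of tokens.
    p = self_attention_map(exo_feats).mean(axis=0)
    q = self_attention_map(ego_feats).mean(axis=0)
    eps = 1e-9  # numerical stability for the log
    kl = lambda u, v: float(np.sum(u * np.log((u + eps) / (v + eps))))
    return 0.5 * (kl(p, q) + kl(q, p))  # symmetric KL, >= 0

# Usage: identical features give zero loss; distinct views give a positive one.
rng = np.random.default_rng(0)
exo = rng.normal(size=(8, 16))
ego = rng.normal(size=(8, 16))
print(cross_view_attention_loss(exo, exo))  # ~0.0
print(cross_view_attention_loss(exo, ego))  # > 0
```

In a real training loop this term would be added to the classification loss so that gradients push the egocentric backbone's attention statistics toward the exocentric teacher's, without requiring paired clips.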
Pages: 11
Related papers
14 items in total
  • [1] Cross-View Generalisation in Action Recognition: Feature Design for Transitioning from Exocentric To Egocentric Views
    Rocha, Bernardo
    Moreno, Plinio
    Bernardino, Alexandre
    ROBOT 2023: SIXTH IBERIAN ROBOTICS CONFERENCE ADVANCES IN ROBOTICS, VOL 1, 2024, 976 : 155 - 166
  • [2] Heterogeneous discriminant analysis for cross-view action recognition
    Sui, Wanchen
    Wu, Xinxiao
    Feng, Yang
    Jia, Yunde
    NEUROCOMPUTING, 2016, 191 : 286 - 295
  • [3] Heterogeneous Discriminant Analysis for Cross-View Action Recognition
    Sui, Wanchen
    Wu, Xinxiao
    Feng, Yang
    Liang, Wei
    Jia, Yunde
    NEURAL INFORMATION PROCESSING, ICONIP 2015, PT IV, 2015, 9492 : 566 - 573
  • [4] Cross-view action recognition with small-scale datasets
    Goyal, Gaurvi
    Noceti, Nicoletta
    Odone, Francesca
    IMAGE AND VISION COMPUTING, 2022, 120
  • [5] Cross-View Action Recognition Based on a Statistical Translation Framework
    Wang, Jing
    Zheng, Huicheng
    Gao, Jinyu
    Cen, Jiepeng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2016, 26 (08) : 1461 - 1475
  • [6] Multi-layer representation for cross-view action recognition
    Liu, Zhigang
    Wu, Yin
    Yin, Ziyang
    INFORMATION SCIENCES, 2024, 659
  • [7] Cross-View Action Recognition Over Heterogeneous Feature Spaces
    Wu, Xinxiao
    Wang, Han
    Liu, Cuiwei
    Jia, Yunde
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (11) : 4096 - 4108
  • [8] Learning Representations From Skeletal Self-Similarities for Cross-View Action Recognition
    Shao, Zhanpeng
    Li, Youfu
    Zhang, Hong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (01) : 160 - 174
  • [9] Multiple Continuous Virtual Paths Based Cross-View Action Recognition
    Zhang, Zhong
    Liu, Shuang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2016, 30 (05)
  • [10] Cross-View Action Recognition Using Contextual Maximum Margin Clustering
    Zhang, Zhong
    Wang, Chunheng
    Xiao, Baihua
    Zhou, Wen
    Liu, Shuang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2014, 24 (10) : 1663 - 1668