Improving Action Recognition via Temporal and Complementary Learning

Cited by: 6
Authors
Elmadany, Nour Eldin [1, 2]
He, Yifeng [3, 4]
Guan, Ling [1 ]
Affiliations
[1] Ryerson Univ, Dept Elect Comp & Biomed Engn, 350 Victoria St, Toronto, ON M5B 2K3, Canada
[2] Vector Inst, Toronto, ON, Canada
[3] Ryerson Univ, Toronto, ON, Canada
[4] 117 Micmac Cres, N York, ON M2H 2K1, Canada
Keywords
Deep ConvNets; two-stream networks; HISTOGRAMS; FLOW;
DOI
10.1145/3447686
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this article, we study the problem of video-based action recognition. We improve the action recognition performance by finding an effective temporal and appearance representation. For capturing the temporal representation, we introduce two temporal learning techniques for improving long-term temporal information modeling, specifically Temporal Relational Network and Temporal Second-Order Pooling-based Network. Moreover, we harness the representation using complementary learning techniques, specifically Global-Local Network and Fuse-Inception Network. Performance evaluation on three datasets (UCF101, HMDB-51, and Mini-Kinetics-200) demonstrated the superiority of the proposed framework compared to the 2D Deep ConvNets-based state-of-the-art techniques.
Pages: 24