Improving Action Recognition via Temporal and Complementary Learning

Cited by: 6
Authors
Elmadany, Nour Eldin [1, 2]
He, Yifeng [3, 4]
Guan, Ling [1 ]
Affiliations
[1] Ryerson Univ, Dept Elect Comp & Biomed Engn, 350 Victoria St, Toronto, ON M5B 2K3, Canada
[2] Vector Inst, Toronto, ON, Canada
[3] Ryerson Univ, Toronto, ON, Canada
[4] 117 Micmac Cres, N York, ON M2H 2K1, Canada
Keywords
Deep ConvNets; two-stream networks; HISTOGRAMS; FLOW;
DOI
10.1145/3447686
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, we study the problem of video-based action recognition. We improve the action recognition performance by finding an effective temporal and appearance representation. For capturing the temporal representation, we introduce two temporal learning techniques for improving long-term temporal information modeling, specifically Temporal Relational Network and Temporal Second-Order Pooling-based Network. Moreover, we harness the representation using complementary learning techniques, specifically Global-Local Network and Fuse-Inception Network. Performance evaluation on three datasets (UCF101, HMDB-51, and Mini-Kinetics-200) demonstrated the superiority of the proposed framework compared to the 2D Deep ConvNets-based state-of-the-art techniques.
Pages: 24