Detecting action tubes via spatial action estimation and temporal path inference

Cited by: 4
Authors
Li, Nannan [1 ]
Huang, Jingjia [1 ]
Li, Thomas [2 ]
Guo, Huiwen [3 ]
Li, Ge [1 ]
Affiliations
[1] Peking Univ, Shenzhen Grad Sch, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Gpower Semicond Inc, Suzhou, Peoples R China
[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
关键词
Deep learning; Action detection; Spatial localization; Region proposal network; Tracking-by-detection; SUM-PRODUCT NETWORKS; ACTION RECOGNITION;
DOI
10.1016/j.neucom.2018.05.033
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the problem of action detection in unconstrained video clips. Our approach starts from action detection on object proposals at each frame, then aggregates the frame-level detection results belonging to the same actor across the whole video via linking, association, and tracking to generate action tubes that are spatially compact and temporally continuous. To this end, we first propose a novel action detection model with a two-stream architecture, which utilizes fused features from both appearance and motion cues and can be trained end-to-end. Then, the association of the action paths is formulated as a maximum set coverage problem with the action detection results as a prior. We utilize an incremental search algorithm to obtain all the action proposals in a single pass with great efficiency, especially when dealing with videos of long duration or with multiple action instances. Finally, a tracking-by-detection scheme is designed to further refine the generated action paths. Extensive experiments on three validation datasets, UCF-Sports, UCF-101 and J-HMDB, show that the proposed approach advances the state of the art in action detection in terms of both accuracy and proposal quality. (C) 2018 Elsevier B.V. All rights reserved.
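The abstract's linking step, which joins per-frame detections of the same actor into a temporally continuous path, can be sketched as a simple dynamic program over detection score plus IoU continuity between consecutive frames. This is a hypothetical simplification for illustration only, not the paper's maximum-set-coverage formulation or its incremental search algorithm; all boxes and scores below are made up.

```python
# Hypothetical sketch: link per-frame action detections into one action path
# by dynamic programming over (detection score + IoU continuity). This is a
# common simplification of frame-level linking, not the paper's exact method.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_detections(frames):
    """frames: list of per-frame detections, each a list of (box, score).
    Returns the index of the chosen detection in every frame (one path)."""
    n = len(frames)
    # best[t][j]: best cumulative value of a path ending at detection j of frame t
    best = [[s for _, s in frames[0]]]
    back = []  # back[t-1][j]: predecessor index in frame t-1
    for t in range(1, n):
        cur, ptr = [], []
        for box_j, s_j in frames[t]:
            # transition value: previous cumulative value + spatial continuity
            vals = [best[t - 1][i] + iou(box_i, box_j)
                    for i, (box_i, _) in enumerate(frames[t - 1])]
            k = max(range(len(vals)), key=vals.__getitem__)
            cur.append(vals[k] + s_j)
            ptr.append(k)
        best.append(cur)
        back.append(ptr)
    # backtrack the highest-value path from the last frame
    j = max(range(len(best[-1])), key=best[-1].__getitem__)
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    return path[::-1]
```

For example, with two frames each holding one strong, overlapping detection and one weak distractor, the recovered path follows the strong detections. A real system would additionally handle paths that start or end mid-video and multiple simultaneous actors, which is where the maximum-set-coverage formulation comes in.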
Pages: 65 - 77
Page count: 13
Related Papers
50 records in total
  • [41] Accelerating temporal action proposal generation via high performance computing
    Wang, Tian
    Lei, Shiye
    Jiang, Youyou
    Chang, Choi
    Snoussi, Hichem
    Shan, Guangcun
    Fu, Yao
    FRONTIERS OF COMPUTER SCIENCE, 2022, 16 (04) : 61 - 70
  • [42] A Novel Action Recognition Scheme Based on Spatial-Temporal Pyramid Model
    Zhao, Hengying
    Xiang, Xinguang
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT II, 2018, 10736 : 212 - 221
  • [43] Spatial-temporal channel-wise attention network for action recognition
    Chen, Lin
    Liu, Yungang
    Man, Yongchao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (14) : 21789 - 21808
  • [44] MULTI-STREAM SINGLE SHOT SPATIAL-TEMPORAL ACTION DETECTION
    Zhang, Pengfei
    Cao, Yu
    Liu, Benyuan
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 3691 - 3695
  • [45] Recurrent attention network using spatial-temporal relations for action recognition
    Zhang, Mingxing
    Yang, Yang
    Ji, Yanli
    Xie, Ning
    Shen, Fumin
    SIGNAL PROCESSING, 2018, 145 : 137 - 145
  • [46] STAP: Spatial-Temporal Attention-Aware Pooling for Action Recognition
    Nguyen, Tam V.
    Song, Zheng
    Yan, Shuicheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2015, 25 (01) : 77 - 86
  • [47] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
    Zheng, Zhenxing
    An, Gaoyun
    Wu, Dapeng
    Ruan, Qiuqi
    NEUROCOMPUTING, 2019, 358 : 446 - 455
  • [49] Squeeze-and-Excitation on Spatial and Temporal Deep Feature Space for Action Recognition
    An, Gaoyun
    Zhou, Wen
    Wu, Yuxuan
    Zheng, Zhenxing
    Liu, Yongwen
    PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 648 - 653
  • [50] Spatial-Temporal Exclusive Capsule Network for Open Set Action Recognition
    Feng, Yangbo
    Gao, Junyu
    Yang, Shicai
    Xu, Changsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 9464 - 9478