Spatial Enhancement and Temporal Constraint for Weakly Supervised Action Localization

Cited by: 5
Authors
Qin, Xiaolei [1]
Ge, Yongxin [1]
Yu, Hui [2]
Chen, Feiyu [1]
Yang, Dan [1]
Affiliations
[1] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 401331, Peoples R China
[2] Univ Portsmouth, Portsmouth PO1 2DJ, Hants, England
Keywords
Training; Proposals; Feature extraction; Two dimensional displays; Entropy; Signal processing; Signal processing algorithms; Weakly supervised temporal action localization; spatial enhancement; instance sparse constraint; confidence connectivity enhancement
DOI
10.1109/LSP.2020.3018914
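Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract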
Weakly supervised temporal action localization (WSTAL) is a practical but challenging problem in video understanding. Without boundary annotations, most existing methods wrongly activate background snippets or deactivate action snippets, which inevitably degrades the localization of action instances. In this letter, we propose a spatial enhancement and temporal constraint (SETC) model that addresses this problem from three aspects. First, a spatial enhancement module improves the discriminability of the extracted features. Second, an instance sparse constraint restrains drastic fluctuations in the class activation sequence (CAS). Finally, a confidence connectivity enhancement reconnects snippets that were broken up by mistake. Experiments on the THUMOS'14 and ActivityNet datasets validate the efficacy of SETC against existing state-of-the-art WSTAL algorithms.
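To make the last step of the abstract concrete, the following is a minimal, generic sketch of how a class activation sequence can be thresholded into action segments and how segments split by a short dip can be rejoined. It is not the SETC method itself: the function names `segments_from_cas` and `bridge_short_gaps`, the fixed `threshold`, and the `max_gap` rule are all assumptions chosen for illustration, and the record above does not describe how the paper implements confidence connectivity enhancement.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open snippet range [start, end)


def segments_from_cas(cas: List[float], threshold: float) -> List[Segment]:
    """Return the ranges where the activation is at or above the threshold."""
    segments: List[Segment] = []
    start = None
    for i, score in enumerate(cas):
        if score >= threshold and start is None:
            start = i  # a new segment opens here
        elif score < threshold and start is not None:
            segments.append((start, i))  # the open segment closes here
            start = None
    if start is not None:
        segments.append((start, len(cas)))  # segment runs to the end
    return segments


def bridge_short_gaps(segments: List[Segment], max_gap: int) -> List[Segment]:
    """Merge consecutive segments separated by at most max_gap snippets.

    Hypothetical gap rule, used only to illustrate reconnecting
    snippets that a brief drop in activation split apart.
    """
    merged: List[Segment] = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
    return merged


# Toy activation sequence for one action class (made-up values).
cas = [0.1, 0.8, 0.9, 0.2, 0.7, 0.8, 0.1, 0.1, 0.1, 0.9]
raw = segments_from_cas(cas, threshold=0.5)
joined = bridge_short_gaps(raw, max_gap=1)
print(raw)     # [(1, 3), (4, 6), (9, 10)]
print(joined)  # [(1, 6), (9, 10)]
```

In this toy example the one-snippet dip at index 3 splits a single action into two proposals, and the gap rule merges them back, while the longer gap before index 9 is left alone.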
Pages: 1520-1524
Page count: 5
Related papers
19 in total
  • [1] Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization
    Alwassel, Humam
    Heilbron, Fabian Caba
    Ghanem, Bernard
    [J]. COMPUTER VISION - ECCV 2018, PT IX, 2018, 11213 : 253 - 269
  • [2] Heilbron FC, 2015, PROC CVPR IEEE, P961, DOI 10.1109/CVPR.2015.7298698
  • [3] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
    Carreira, Joao
    Zisserman, Andrew
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4724 - 4733
  • [4] Rethinking the Faster R-CNN Architecture for Temporal Action Localization
    Chao, Yu-Wei
    Vijayanarasimhan, Sudheendra
    Seybold, Bryan
    Ross, David A.
    Deng, Jia
    Sukthankar, Rahul
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 1130 - 1139
  • [5] Temporal Context Network for Activity Localization in Videos
    Dai, Xiyang
    Singh, Bharat
    Zhang, Guyue
    Davis, Larry S.
    Chen, Yan Qiu
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 5727 - 5736
  • [6] Gao J., 2017, P BRIT MACH VIS C BM
  • [7] What Do I Annotate Next? An Empirical Study of Active Learning for Action Localization
    Heilbron, Fabian Caba
    Lee, Joon-Young
    Jin, Hailin
    Ghanem, Bernard
    [J]. COMPUTER VISION - ECCV 2018, PT XI, 2018, 11215 : 212 - 229
  • [8] The THUMOS challenge on action recognition for videos "in the wild"
    Idrees, Haroon
    Zamir, Amir R.
    Jiang, Yu-Gang
    Gorban, Alex
    Laptev, Ivan
    Sukthankar, Rahul
    Shah, Mubarak
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 155 : 1 - 23
  • [9] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
    Lin, Tianwei
    Zhao, Xu
    Su, Haisheng
    Wang, Chongjing
    Yang, Ming
    [J]. COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208 : 3 - 21
  • [10] Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization
    Liu, Daochang
    Jiang, Tingting
    Wang, Yizhou
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 1298 - 1307