Temporal RPN Learning for Weakly-Supervised Temporal Action Localization

Cited by: 0
Authors
Huang, Jing [1 ]
Kong, Ming [2 ,3 ]
Chen, Luyuan [4 ]
Liang, Tian [1 ]
Zhu, Qiang [2 ]
Affiliations
[1] Zhejiang Univ, Hangzhou 310058, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310058, Peoples R China
[3] Hikvis Res Inst, Hangzhou 310051, Peoples R China
[4] Beijing Informat Sci & Technol Univ, Beijing 100101, Peoples R China
Source
ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222 | 2023 / Vol. 222
Keywords
Weakly-Supervised Learning; Action Localization; Temporal Region Proposal;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Weakly-Supervised Temporal Action Localization (WSTAL) aims to train an action instance localization model from untrimmed videos with only video-level labels, a setting analogous to the Object Detection (OD) task. Existing Top-k MIL-based WSTAL methods cannot flexibly define the learning space, which limits the model's learning efficiency and performance. Faster R-CNN is a classic two-stage object detection architecture with an efficient Region Proposal Network (RPN). This paper migrates a Faster R-CNN-like two-stage architecture to the WSTAL task: first, it builds a Temporal RPN (T-RPN) and integrates it with the traditional WSTAL framework; second, it proposes a pseudo-label generation mechanism that enables T-RPN learning without temporal annotations. The new framework achieves breakthrough performance on the THUMOS-14 and ActivityNet-v1.2 datasets, and comprehensive ablation experiments verify the effectiveness of the innovations. Code will be available at: https://github.com/ZJUHJ/TRPN.
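For readers unfamiliar with the Top-k MIL aggregation that the abstract contrasts against, the following is a minimal sketch of how such methods commonly turn per-snippet class scores into a video-level score: for each class, the k highest snippet scores are averaged. The function name `topk_mil_scores` and the list-of-lists input layout are assumptions made for illustration; this is not taken from the paper or its repository, and it does not implement the T-RPN itself.

```python
def topk_mil_scores(snippet_scores, k):
    """Average the k highest snippet scores per class (hypothetical helper).

    snippet_scores: list of T rows, each a list of C per-class scores.
    Returns a list of C video-level scores.
    """
    if not snippet_scores:
        raise ValueError("need at least one snippet")
    # Clamp k to the valid range [1, T].
    k = max(1, min(k, len(snippet_scores)))
    num_classes = len(snippet_scores[0])
    video_scores = []
    for c in range(num_classes):
        # Sort this class's scores over time, highest first.
        column = sorted((row[c] for row in snippet_scores), reverse=True)
        video_scores.append(sum(column[:k]) / k)
    return video_scores


# Four snippets, two classes; only the top-2 snippets per class contribute.
scores = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.3], [0.2, 0.7]]
video_level = topk_mil_scores(scores, k=2)
```

Because only the k highest-scoring snippets receive a training signal for each class, the set of snippets the model learns from is fixed by k, which is one way to read the abstract's remark that such methods cannot flexibly define the learning space.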
Pages: 16
Related papers
50 in total
  • [31] Modeling Sub-Actions for Weakly Supervised Temporal Action Localization
    Huang, Linjiang
    Huang, Yan
    Ouyang, Wanli
    Wang, Liang
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5154 - 5167
  • [32] Weakly-supervised temporal attention 3D network for human action recognition
    Kim, Jonghyun
    Li, Gen
    Yun, Inyong
    Jung, Cheolkon
    Kim, Joongkyu
    [J]. PATTERN RECOGNITION, 2021, 119
  • [33] Weakly-Supervised Learning for Tool Localization in Laparoscopic Videos
    Vardazaryan, Armine
    Mutter, Didier
    Marescaux, Jacques
    Padoy, Nicolas
    [J]. INTRAVASCULAR IMAGING AND COMPUTER ASSISTED STENTING AND LARGE-SCALE ANNOTATION OF BIOMEDICAL DATA AND EXPERT LABEL SYNTHESIS, 2018, 11043 : 169 - 179
  • [34] Dual Masked Modeling for Weakly-Supervised Temporal Boundary Discovery
    Ma, Yuer
    Liu, Yi
    Wang, Limin
    Kang, Wenxiong
    Qiao, Yu
    Wang, Yali
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 5694 - 5704
  • [35] Entropy guided attention network for weakly-supervised action localization
    Cheng, Yi
    Sun, Ying
    Fan, Hehe
    Zhuo, Tao
    Lim, Joo-Hwee
    Kankanhalli, Mohan
    [J]. PATTERN RECOGNITION, 2022, 129
  • [36] Towards better utilization of pseudo labels for weakly supervised temporal action localization
    Tang, Yiping
    Ge, Junyao
    Guo, Kaitai
    Zheng, Yang
    Hu, Haihong
    Liang, Jimin
    [J]. INFORMATION SCIENCES, 2023, 623 : 693 - 708
  • [37] Weakly Supervised Regional and Temporal Learning for Facial Action Unit Recognition
    Yan, Jingwei
    Wang, Jingjing
    Li, Qiang
    Wang, Chunmao
    Pu, Shiliang
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1760 - 1772
  • [38] Uncertainty Guided Collaborative Training for Weakly Supervised and Unsupervised Temporal Action Localization
    Yang, Wenfei
    Zhang, Tianzhu
    Zhang, Yongdong
    Wu, Feng
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 5252 - 5267
  • [39] Deep Learning for Weakly-Supervised Object Detection and Localization: A Survey
    Shao, Feifei
    Chen, Long
    Shao, Jian
    Ji, Wei
    Xiao, Shaoning
    Ye, Lu
    Zhuang, Yueting
    Xiao, Jun
    [J]. NEUROCOMPUTING, 2022, 496 : 192 - 207
  • [40] Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance
    Luo, Dixin
    Wang, Yutong
    Yue, Angxiao
    Xu, Hongteng
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022,