Deep cascaded action attention network for weakly-supervised temporal action localization

被引:0
|
作者
Hui-fen Xia
Yong-zhao Zhan
机构
[1] Jiangsu University,School of Computer Science and Communication Engineering
[2] Changzhou Vocational Institute of Mechatronic Technology,undefined
[3] Jiangsu Engineering Research Center of Big Data Ubiquitous Perception and Intelligent Agriculture Applications,undefined
来源
关键词
Weakly-supervised; Temporal action localization; Deep cascaded action attention; Non-action suppression;
D O I
暂无
中图分类号
学科分类号
摘要
Weakly-supervised temporal action localization (W-TAL) is to locate the boundaries of action instances and classify them in an untrimmed video, which is a challenging task due to only video-level labels during training. Existing methods mainly focus on the most discriminative action snippets of a video by using top-k multiple instance learning (MIL), and ignore the usage of less discriminative action snippets and non-action snippets. This makes the localization performance improve limitedly. In order to mine the less discriminative action snippets and distinguish the non-action snippets better in a video, a novel method based on deep cascaded action attention network is proposed. In this method, the deep cascaded action attention mechanism is presented to model not only the most discriminative action snippets, but also different levels of less discriminative action snippets by introducing threshold erasing, which ensures the completeness of action instances. Besides, the entropy loss for non-action is introduced to restrict the activations of non-action snippets for all action categories, which are generated by aggregating the bottom-k activation scores along the temporal dimension. Thereby, the action snippets can be distinguished from non-action snippets better, which is beneficial to the separation of action and non-action snippets and enables the action instances more accurate. Ultimately, our method can facilitate more precise action localization. Extensive experiments conducted on THUMOS14 and ActivityNet1.3 datasets show that our method outperforms state-of-the-art methods at several t-IoU thresholds.
引用
收藏
页码:29769 / 29787
页数:18
相关论文
共 50 条
  • [1] Deep cascaded action attention network for weakly-supervised temporal action localization
    Xia, Hui-fen
    Zhan, Yong-zhao
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (19) : 29769 - 29787
  • [2] Action Coherence Network for Weakly-Supervised Temporal Action Localization
    Zhai, Yuanhao
    Wang, Le
    Tang, Wei
    Zhang, Qilin
    Zheng, Nanning
    Hua, Gang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1857 - 1870
  • [3] A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
    Islam, Ashraful
    Long, Chengjiang
    Radke, Richard
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 1637 - 1645
  • [4] Deep Motion Prior for Weakly-Supervised Temporal Action Localization
    Cao, Meng
    Zhang, Can
    Chen, Long
    Shou, Mike Zheng
    Zou, Yuexian
    IEEE Transactions on Image Processing, 2022, 31 : 5203 - 5213
  • [5] Entropy guided attention network for weakly-supervised action localization
    Cheng, Yi
    Sun, Ying
    Fan, Hehe
    Zhuo, Tao
    Lim, Joo-Hwee
    Kankanhalli, Mohan
    PATTERN RECOGNITION, 2022, 129
  • [6] Deep Motion Prior for Weakly-Supervised Temporal Action Localization
    Cao, Meng
    Zhang, Can
    Chen, Long
    Shou, Mike Zheng
    Zou, Yuexian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5203 - 5213
  • [7] Background Suppression Network for Weakly-Supervised Temporal Action Localization
    Lee, Pilhyeon
    Uh, Youngjung
    Byun, Hyeran
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 11320 - 11327
  • [8] Feature Matching Network for Weakly-Supervised Temporal Action Localization
    Dou, Peng
    Zhou, Wei
    Liao, Zhongke
    Hu, Haifeng
    PATTERN RECOGNITION AND COMPUTER VISION, PT IV, 2021, 13022 : 459 - 471
  • [9] ACGNet: Action Complement Graph Network for Weakly-Supervised Temporal Action Localization
    Yang, Zichen
    Qin, Jie
    Huang, Di
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3090 - 3098
  • [10] Weakly-supervised temporal action localization: a survey
    AbdulRahman Baraka
    Mohd Halim Mohd Noor
    Neural Computing and Applications, 2022, 34 : 8479 - 8499