Deep cascaded action attention network for weakly-supervised temporal action localization

Cited by: 0
Authors
Hui-fen Xia
Yong-zhao Zhan
Affiliations
[1] Jiangsu University, School of Computer Science and Communication Engineering
[2] Changzhou Vocational Institute of Mechatronic Technology
[3] Jiangsu Engineering Research Center of Big Data Ubiquitous Perception and Intelligent Agriculture Applications
Source
Multimedia Tools and Applications | 2023 / Vol. 82
Keywords
Weakly-supervised; Temporal action localization; Deep cascaded action attention; Non-action suppression
DOI
Not available
Abstract
Weakly-supervised temporal action localization (W-TAL) aims to locate the boundaries of action instances in an untrimmed video and classify them. The task is challenging because only video-level labels are available during training. Existing methods mainly focus on the most discriminative action snippets of a video by using top-k multiple instance learning (MIL), and ignore less discriminative action snippets and non-action snippets, which limits the gain in localization performance. To better mine the less discriminative action snippets and distinguish the non-action snippets in a video, a novel method based on a deep cascaded action attention network is proposed. In this method, the deep cascaded action attention mechanism models not only the most discriminative action snippets but also different levels of less discriminative action snippets by introducing threshold erasing, which ensures the completeness of action instances. In addition, an entropy loss for non-action is introduced to restrict the activations of non-action snippets for all action categories; these activations are generated by aggregating the bottom-k activation scores along the temporal dimension. As a result, action snippets are better separated from non-action snippets, which makes the localized action instances more accurate and enables more precise action localization. Extensive experiments on the THUMOS14 and ActivityNet1.3 datasets show that our method outperforms state-of-the-art methods at several t-IoU thresholds.
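The pooling and erasing steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a softmax over pooled scores, the sign convention of the entropy term (written here so that minimizing it pushes the non-action class distribution toward uniform), and the hard 0/1 erasing mask are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def k_pool(cas, k, largest=True):
    """Average the k largest (or k smallest) scores per class along time.

    cas: list of T snippets, each a list of C class activation scores.
    largest=True  -> top-k pooling, as in top-k MIL.
    largest=False -> bottom-k pooling, used here as a non-action proxy.
    """
    num_classes = len(cas[0])
    pooled = []
    for c in range(num_classes):
        column = sorted((snippet[c] for snippet in cas), reverse=largest)
        pooled.append(sum(column[:k]) / k)
    return pooled


def non_action_entropy_loss(cas, k):
    """Negative entropy of the class distribution from bottom-k pooling.

    Assumption: minimizing this value flattens the distribution, so that
    non-action snippets do not favor any action category.
    """
    probs = softmax(k_pool(cas, k, largest=False))
    entropy = -sum(p * math.log(p + 1e-12) for p in probs)
    return -entropy


def erase_above_threshold(attention, threshold):
    """Hypothetical erasing mask: 0.0 where attention exceeds the threshold.

    A later cascade stage could multiply its input by this mask so that it
    attends to the remaining, less discriminative snippets.
    """
    return [0.0 if a > threshold else 1.0 for a in attention]


# Toy input: 4 snippets, 2 classes.
cas = [[3.0, 0.0], [2.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
video_scores = k_pool(cas, k=2)                 # top-k video-level scores
loss = non_action_entropy_loss(cas, k=2)        # bottom-k entropy term
mask = erase_above_threshold([0.9, 0.2, 0.6], threshold=0.5)
```

In this toy input the two highest-scoring snippets drive the video-level score for the first class, while the two lowest-scoring snippets give identical scores for both classes, which is the uniform case the entropy term favors under the stated assumption.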
Pages: 29769 - 29787
Page count: 18
Related papers
50 in total
  • [31] Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization
    Fu, Jie
    Gao, Junyu
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12427 - 12443
  • [32] Weakly-Supervised Temporal Action Localization with Multi-Head Cross-Modal Attention
    Ren, Hao
    Ren, Haoran
    Ran, Wu
    Lu, Hong
    Jin, Cheng
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631 : 281 - 295
  • [33] Adversarial Seeded Sequence Growing for Weakly-Supervised Temporal Action Localization
    Zhang, Chengwei
    Xu, Yunlu
    Cheng, Zhanzhan
    Niu, Yi
    Pu, Shiliang
    Wu, Fei
    Zou, Futai
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019 : 738 - 746
  • [34] Diffusion-based framework for weakly-supervised temporal action localization
    Zou, Yuanbing
    Zhao, Qingjie
    Sarker, Prodip Kumar
    Li, Shanshan
    Wang, Lei
    Liu, Wangwang
    Pattern Recognition, 2025, 160
  • [35] Unleashing the Potential of Adjacent Snippets for Weakly-supervised Temporal Action Localization
    Liu, Qinying
    Wang, Zilei
    Chen, Ruoxi
    Li, Zhilin
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023 : 1032 - 1037
  • [36] TSCANet: a two-stream context aggregation network for weakly-supervised temporal action localization
    Zhang, Haiping
    Lin, Haixiang
    Wang, Dongjing
    Xu, Dongyang
    Zhou, Fuxing
    Guan, Liming
    Yu, Dongjing
    Fang, Xujian
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [37] Temporal Dropout for Weakly Supervised Action Localization
    Xie, Chi
    Zhuang, Zikun
    Zhao, Shengjie
    Liang, Shuang
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (03)
  • [38] Enhancing action discrimination via category-specific frame clustering for weakly-supervised temporal action localization
    Xia, Huifen
    Zhan, Yongzhao
    Liu, Honglin
    Ren, Xiaopeng
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2024, 25 (06) : 809 - 823
  • [39] Multi-Hierarchical Category Supervision for Weakly-Supervised Temporal Action Localization
    Li, Guozhang
    Li, Jie
    Wang, Nannan
    Ding, Xinpeng
    Li, Zhifeng
    Gao, Xinbo
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 9332 - 9344
  • [40] Action-Semantic Consistent Knowledge for Weakly-Supervised Action Localization
    Wang, Yu
    Zhao, Shengjie
    Chen, Shiwei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10279 - 10289