Multi-scale Dynamic Network for Temporal Action Detection

Cited by: 2
Authors
Ren, Yifan [1, 2]
Xu, Xing [1, 2]
Shen, Fumin [1, 2]
Wang, Zheng [1, 2]
Yang, Yang [1, 2]
Shen, Heng Tao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Future Media, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Source
PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Temporal Action Detection; Dynamic Filters; Multi-scale Features;
DOI
10.1145/3460426.3463613
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, Temporal Action Detection, a fundamental task in video understanding, has attracted extensive attention. Most existing approaches process all input videos with the same model parameters, so they cannot adapt to the input video at inference time. In this paper, we propose a novel model termed Multi-scale Dynamic Network (MDN) to tackle this problem. The proposed MDN model incorporates multiple Multi-scale Dynamic Modules (MDMs). Each MDM generates video-specific and segment-specific convolution kernels based on video content at different scales and adaptively captures rich semantic information for prediction. In addition, we design a new Edge Suppression Loss (ESL) function that lets MDN pay more attention to hard examples. Extensive experiments on two popular benchmarks, ActivityNet-1.3 and THUMOS-14, show that the proposed MDN model achieves state-of-the-art performance.
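The idea of input-dependent ("dynamic") kernels applied at several temporal scales can be illustrated with a toy sketch. This is not the MDN or MDM architecture from the paper: the kernel generator, the use of dilation to stand in for "scales", and every function name below (`generate_kernel`, `conv1d_same`, `multi_scale_dynamic_conv`) are assumptions made purely for illustration, on a single-channel sequence of floats.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate_kernel(sequence, kernel_size=3):
    """Hypothetical kernel generator: the weights depend on the input's mean.

    A real dynamic-filter network would learn this mapping; here it is a
    fixed rule so that different inputs visibly yield different kernels.
    """
    mean = sum(sequence) / len(sequence)
    center = kernel_size // 2
    return softmax([mean * (i - center) for i in range(kernel_size)])


def conv1d_same(sequence, kernel, dilation=1):
    """1D convolution with zero padding, so output length equals input length."""
    center = len(kernel) // 2
    out = []
    for t in range(len(sequence)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t + (i - center) * dilation
            if 0 <= idx < len(sequence):  # positions outside the sequence count as zero
                acc += w * sequence[idx]
        out.append(acc)
    return out


def multi_scale_dynamic_conv(sequence, dilations=(1, 2, 4)):
    """Apply the input-specific kernel at several dilations and average the results."""
    kernel = generate_kernel(sequence)
    outputs = [conv1d_same(sequence, kernel, d) for d in dilations]
    return [sum(vals) / len(dilations) for vals in zip(*outputs)]


if __name__ == "__main__":
    print(generate_kernel([0.0, 0.0, 0.0, 0.0]))  # uniform kernel for an all-zero input
    print(generate_kernel([1.0, 1.0, 1.0, 1.0]))  # a different kernel for a different input
    print(multi_scale_dynamic_conv([0.1, 0.5, 0.9, 0.2, 0.0]))
```

The contrast with a static model is that `generate_kernel` is re-evaluated for each input, whereas a conventional convolution would reuse one fixed kernel for every video.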
Pages: 267-275
Number of pages: 9