Multi-scale Dynamic Network for Temporal Action Detection

Cited by: 2
Authors
Ren, Yifan [1, 2]
Xu, Xing [1, 2]
Shen, Fumin [1, 2]
Wang, Zheng [1, 2]
Yang, Yang [1, 2]
Shen, Heng Tao [1, 2]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Future Media, Chengdu, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu, Peoples R China
Source
PROCEEDINGS OF THE 2021 INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL (ICMR '21) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Temporal Action Detection; Dynamic Filters; Multi-scale Features;
DOI
10.1145/3460426.3463613
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Temporal Action Detection, a fundamental task in video understanding, has attracted extensive attention in recent years. Most existing approaches process every input video with the same model parameters and therefore cannot adapt to the input video at inference time. In this paper, we propose a novel model termed Multi-scale Dynamic Network (MDN) to tackle this problem. MDN incorporates multiple Multi-scale Dynamic Modules (MDMs). Each MDM generates video-specific and segment-specific convolution kernels from video content at different scales and adaptively captures rich semantic information for prediction. In addition, we design a new Edge Suppression Loss (ESL) function that makes MDN pay more attention to hard examples. Extensive experiments on two popular benchmarks, ActivityNet-1.3 and THUMOS-14, show that the proposed MDN model achieves state-of-the-art performance.
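The general idea of a dynamic filter, where the convolution kernel is computed from the input rather than fixed after training, can be illustrated with a small sketch. This is not the authors' MDM: the abstract gives no architectural details, so the kernel generator below (max over average-pooled summaries at several temporal scales, a fixed mixing matrix, and a softmax) and every name in it (`avg_pool`, `generate_kernel`, `SCALES`, `MIXING`) are assumptions made for illustration only. It produces one kernel per video; segment-specific kernels and the Edge Suppression Loss are not modelled.

```python
import math


def avg_pool(seq, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    out = []
    for start in range(0, len(seq), scale):
        window = seq[start:start + scale]
        out.append(sum(window) / len(window))
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def generate_kernel(seq, scales, mixing):
    """Hypothetical kernel generator (an assumption, not the paper's design).

    One summary per temporal scale is taken from the input itself, then a
    fixed mixing matrix maps the summaries to kernel taps.
    """
    summaries = [max(avg_pool(seq, s)) for s in scales]
    logits = [sum(w * v for w, v in zip(row, summaries)) for row in mixing]
    return softmax(logits)


def conv1d_same(seq, kernel):
    """1-D convolution with zero padding so the output keeps the input length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(seq) + [0.0] * half
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel))
        for t in range(len(seq))
    ]


def dynamic_conv(seq, scales, mixing):
    """Generate a kernel from `seq`, then filter `seq` with that kernel."""
    kernel = generate_kernel(seq, scales, mixing)
    return kernel, conv1d_same(seq, kernel)


# Illustrative values only: three temporal scales and a 3-tap kernel.
SCALES = (1, 2, 4)
MIXING = [
    [0.5, -0.2, 0.1],
    [0.1, 0.3, -0.4],
    [-0.3, 0.2, 0.6],
]

# Two toy one-channel "videos": one with a clear action peak, one nearly flat.
video_a = [0.0, 0.1, 0.9, 1.0, 0.8, 0.1, 0.0, 0.0]
video_b = [0.2, 0.2, 0.3, 0.2, 0.3, 0.2, 0.2, 0.3]

kernel_a, out_a = dynamic_conv(video_a, SCALES, MIXING)
kernel_b, out_b = dynamic_conv(video_b, SCALES, MIXING)
```

With a static filter, `kernel_a` and `kernel_b` would be identical. Here they differ because each is derived from its own input, which is the property the abstract contrasts with using the same parameters for all videos.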
Pages: 267-275
Number of pages: 9