TEMPORAL ACTION PROPOSAL GENERATION VIA DEEP FEATURE ENHANCEMENT

Cited by: 0
Authors
Hsieh, He-Yen [1]
Chen, Ding-Jie [1]
Liu, Tyng-Luh [1]
Affiliations
[1] Acad Sinica, Inst Informat Sci, Taipei, Taiwan
Source
2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2020
Keywords
Temporal action proposal generation; Temporal convolution; Untrimmed video
DOI
Not available
Chinese Library Classification number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Temporal action proposal generation (TAPG) is a challenging problem in video content analysis. It aims to localize the video segments that are likely to contain actions or events. Intuitively, making a satisfactory prediction for these video segments relies directly on the quality of their representation. A video segment is typically represented by a two-stream feature, which comprises appearance and motion information. Rather than directly concatenating the two-stream features as previous methods do, we present a feature-aggregation network (FA-Net) that considers the feature relations among neighboring video segments to obtain a high-quality representation that better characterizes the actions or events. Further, we design a feature-expansion network (FE-Net) to extract multi-granularity features for retrieving proposals with high confidence of covering action instances. We evaluate our approach on two challenging datasets: ActivityNet-1.3 and THUMOS-14. The experiments show that the proposed approach consistently outperforms existing state-of-the-art TAPG methods.
Pages: 1391-1395
Page count: 5