LEARNING HIERARCHICAL SELF-ATTENTION FOR VIDEO SUMMARIZATION

Cited by: 0
Authors
Liu, Yen-Ting [1]
Li, Yu-Jhe [1]
Yang, Fu-En [1]
Chen, Shang-Fu [1]
Wang, Yu-Chiang Frank [1]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan
Source
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019
Keywords
Video Summarization; Hierarchical Structure; Attention Model; Deep Learning; Computer Vision;
DOI
10.1109/icip.2019.8803639
Chinese Library Classification number
TB8 [Photographic technology];
Discipline classification code
0804;
Abstract
Video summarization remains a challenging task. Given the abundance of video data on the Internet, the task has drawn significant attention in the vision community and benefits a wide range of applications, e.g., video retrieval and search. To summarize a video effectively by deriving the key-frames that represent the input video, we propose a novel framework named Hierarchical Multi-Attention Network (H-MAN), which comprises a shot-level reconstruction model and a multi-head attention model. Our attention model has a two-stage hierarchical structure that produces various attention maps, and we are among the first to utilize the multi-attention mechanism in the video summarization task, which brings improved performance. Quantitative and qualitative results demonstrate the effectiveness of our model, which performs favorably against state-of-the-art approaches.
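As an illustration only, the idea of two-stage hierarchical multi-head attention over frames and shots can be sketched as follows. This is not the authors' H-MAN implementation: the function names (`attention_weights`, `hierarchical_scores`), the toy feature vectors, and the absence of learned projections and of the shot-level reconstruction model are all assumptions made to keep the sketch short.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(vectors, num_heads):
    """Multi-head scaled dot-product self-attention WITHOUT learned projections
    (an assumption for illustration; a trained model would learn Q/K/V maps).

    Returns one weight per input vector: the attention it receives, averaged
    over all queries and all heads. The weights sum to 1. The feature
    dimension is assumed to be divisible by num_heads.
    """
    n, dim = len(vectors), len(vectors[0])
    head_dim = dim // num_heads
    received = [0.0] * n
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        heads = [v[lo:hi] for v in vectors]  # slice of each vector for this head
        for q in heads:
            scores = [
                sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)
                for k in heads
            ]
            for j, w in enumerate(softmax(scores)):
                received[j] += w
    return [r / (n * num_heads) for r in received]


def hierarchical_scores(shots, num_heads=2):
    """Stage 1 weights frames within each shot; stage 2 weights the shots.

    The final score of a frame is (shot weight) * (frame weight in its shot).
    """
    frame_w = [attention_weights(shot, num_heads) for shot in shots]
    # Represent each shot by the attention-weighted average of its frames.
    shot_vecs = []
    for shot, w in zip(shots, frame_w):
        dim = len(shot[0])
        shot_vecs.append(
            [sum(wi * f[d] for wi, f in zip(w, shot)) for d in range(dim)]
        )
    shot_w = attention_weights(shot_vecs, num_heads)
    return [[sw * fw for fw in ws] for sw, ws in zip(shot_w, frame_w)]


# Toy input (hypothetical): 2 shots with 2 and 3 frames, 4-dim frame features.
shots = [
    [[1.0, 0.0, 0.0, 1.0], [0.9, 0.1, 0.0, 1.0]],
    [[0.0, 1.0, 1.0, 0.0], [0.0, 1.0, 0.8, 0.2], [0.1, 0.9, 1.0, 0.0]],
]
scores = hierarchical_scores(shots)
```

In this sketch the frame scores form a distribution over the whole video, so key-frames could be picked by taking the highest-scoring frames. How H-MAN actually converts its attention maps into key-frame selections is not stated in the abstract.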
Pages: 3377-3381
Number of pages: 5