LEARNING HIERARCHICAL SELF-ATTENTION FOR VIDEO SUMMARIZATION

Cited by: 0

Authors:

Liu, Yen-Ting ^{[1]}

Li, Yu-Jhe ^{[1]}

Yang, Fu-En ^{[1]}

Chen, Shang-Fu ^{[1]}

Wang, Yu-Chiang Frank ^{[1]}

Affiliation:

[1] Natl Taiwan Univ, Grad Inst Commun Engn, Taipei, Taiwan

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2019

Keywords:

Video Summarization; Hierarchical Structure; Attention Model; Deep Learning; Computer Vision;

DOI:

10.1109/icip.2019.8803639

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Discipline classification code:

0804 ;

Abstract:

Video summarization remains a challenging task. Given the abundance of video data on the Internet, this task draws significant attention in the vision community and benefits a wide range of applications, e.g., video retrieval and search. To effectively perform video summarization by deriving the key-frames that represent the given input video, we propose a novel framework named Hierarchical Multi-Attention Network (H-MAN), which comprises a shot-level reconstruction model and a multi-head attention model. Our attention model has a two-stage hierarchical structure that produces various attention maps, and we are among the first to utilize the multi-attention mechanism in the video summarization task, which brings improved performance. The quantitative and qualitative results demonstrate the effectiveness of our model, which performs favorably against state-of-the-art approaches.

Pages: 3377-3381

Number of pages: 5

50 items in total

[1] Self-Attention Based Video Summarization
Li Y.
Wang J.
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2020, 32 (04): 652-659
[2] Self-attention binary neural tree for video summarization
Fu, Hao
Wang, Hongxing
PATTERN RECOGNITION LETTERS, 2021, 143: 19-26
[3] Self-attention binary neural tree for video summarization
Fu, Hao
Wang, Hongxing
Wang, Hongxing (ihxwang@cqu.edu.cn), 1600, Elsevier B.V. (143): : 19 - 26
[4] Learning multiscale hierarchical attention for video summarization
Zhu, Wencheng
Lu, Jiwen
Han, Yucheng
Zhou, Jie
PATTERN RECOGNITION, 2022, 122
[5] Bi-Directional Self-Attention with Relative Positional Encoding for Video Summarization
Lin, Jingxu
Zhong, Sheng-hua
2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020: 1161-1166
[6] Hierarchical Multimodal Attention for Deep Video Summarization
Sanabria, Melissa
Precioso, Frederic
Menguy, Thomas
2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021: 7977-7984
[7] SSAN: Separable Self-Attention Network for Video Representation Learning
Guo, Xudong
Guo, Xun
Lu, Yan
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 12613-12622
[8] Self-Attention Guided Copy Mechanism for Abstractive Summarization
Xu, Song
Li, Haoran
Yuan, Peng
Wu, Youzheng
He, Xiaodong
Zhou, Bowen
58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020: 1355-1362
[9] Learning-based correspondence classifier with self-attention hierarchical network
Mingfan Chu
Yong Ma
Xiaoguang Mei
Jun Huang
Fan Fan
Applied Intelligence, 2023, 53: 24360-24376
[10] Learning-based correspondence classifier with self-attention hierarchical network
Chu, Mingfan
Ma, Yong
Mei, Xiaoguang
Huang, Jun
Fan, Fan
APPLIED INTELLIGENCE, 2023, 53 (20): 24360-24376