Efficient Transformer for Video Summarization

Cited by: 0

Authors:

Kolmakova, Tatiana ^{[1]}

Makarov, Ilya ^{[2,3]}

Affiliations:

[1] HSE Univ, Moscow, Russia

[2] Artificial Intelligence Res Inst AIRI, Moscow, Russia

[3] NUST MISiS, AI Ctr, Moscow, Russia

Source:

ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT II | 2023 / Vol. 14135

Keywords:

Video Summarization; Deep Learning; Transformers; CREATION;

DOI:

10.1007/978-3-031-43078-7_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The amount of user-generated content is increasing daily. This is especially true of video content, which became popular through social media platforms such as TikTok. Other internet sources are keeping up and making video sharing easier. That is why automatic tools that extract the core information of content while reducing its volume are essential. Video summarization aims to address this need. In this work, we propose a transformer-based approach to supervised video summarization. Previous applications of attention architectures either used lighter versions or loaded models with RNN modules, which slow down computation. Our proposed framework uses all the advantages of transformers. Extensive evaluation on two benchmark datasets showed that the introduced model outperforms existing approaches on the SumMe dataset by 3% and shows comparable results on the TVSum dataset.


Pages: 52-65

Page count: 14

