ROLE OF AUDIO IN VIDEO SUMMARIZATION

Cited by: 0
Authors
Shoer, Ibrahim [1]
Kopru, Berkay [1]
Erzin, Engin [1]
Affiliations
[1] Koc Univ, Coll Engn, Multimedia Vis & Graph Grp, KUIS AI Lab, Istanbul, Turkiye
Keywords
Audio-visual video summarization; canonical correlation analysis
DOI
10.1109/ICASSPW59220.2023.10192578
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Video summarization attracts attention for efficient video representation, retrieval, and browsing to ease volume and traffic surge problems. Although video summarization mostly uses the visual channel for compaction, the benefits of audio-visual modeling appeared in recent literature. The information coming from the audio channel can be a result of audio-visual correlation in the video content. In this study, we propose a new audio-visual video summarization framework integrating four ways of audio-visual information fusion with GRU-based and attention-based networks. Furthermore, we investigate a new explainability methodology using audio-visual canonical correlation analysis (CCA) to better understand and explain the role of audio in the video summarization task. Experimental evaluations on the TVSum dataset attain F1 score and Kendall-tau score improvements for the audio-visual video summarization. Furthermore, splitting video content on TVSum and COGNIMUSE datasets based on audio-visual CCA as positively and negatively correlated videos yields a strong performance improvement over the positively correlated videos for audio-only and audio-visual video summarization.
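The abstract does not say how the audio-visual CCA is computed or how videos are labeled positively or negatively correlated. As a rough illustration only, the sketch below computes the largest canonical correlation between two per-frame feature matrices using the standard QR-plus-SVD formulation. The function name, the feature dimensions, and the synthetic data are assumptions for the example and are not taken from the paper; the value returned is a magnitude in [0, 1], so any positive/negative split would need an additional rule that is not shown here.

```python
import numpy as np


def first_canonical_correlation(audio, visual):
    """Largest canonical correlation between two feature sequences.

    Both inputs are (time_steps, feature_dim) arrays with the same number
    of rows. Assumes more rows than columns and full column rank.
    """
    # Center each feature dimension over time.
    a = audio - audio.mean(axis=0)
    v = visual - visual.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered features.
    qa, _ = np.linalg.qr(a)
    qv, _ = np.linalg.qr(v)
    # Singular values of qa^T qv are the canonical correlations.
    s = np.linalg.svd(qa.T @ qv, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


# Hypothetical data: 500 frames, 3 audio and 4 visual feature dimensions
# that share one latent signal plus small independent noise.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 1))
audio_feats = latent @ np.array([[1.0, -0.5, 0.3]]) + 0.1 * rng.standard_normal((500, 3))
visual_feats = latent @ np.array([[0.8, 0.2, -0.6, 0.4]]) + 0.1 * rng.standard_normal((500, 4))

rho_shared = first_canonical_correlation(audio_feats, visual_feats)

# Independent features for comparison: no shared signal.
rho_independent = first_canonical_correlation(
    rng.standard_normal((500, 3)), rng.standard_normal((500, 4))
)
print(rho_shared, rho_independent)
```

In this toy setup the shared-signal pair should score close to 1 and the independent pair much lower, which is the kind of contrast a per-video correlation score could expose.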
Pages: 5