A novel multi-modal neural network approach for dynamic and generic sports video summarization

Cited by: 4
Authors
Narwal, Pulkit [1]
Duhan, Neelam [1]
Bhatia, Komal Kumar [1]
Affiliation
[1] JC Bose Univ Sci & Technol YMCA, Comp Engn Dept, Faridabad, India
Keywords
Video segmentation; Key segment; Cricket; Deep learning; Dynamic summary; Video summarization
DOI
10.1016/j.engappai.2023.106964
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Video summarization is a video compression/compaction technique that creates a shorter yet informative version of an original video. It has provided solutions for many media, user, and engineering applications. Although sports video summarization has been an active research topic for some time, there remains a gap for a multi-modal, dynamic, generic, and domain-knowledge-based approach to Cricket video summarization. This paper presents a multi-modal approach to summarizing Cricket videos that captures domain knowledge from audio-visual cues. A dual neural network pipeline is proposed to dynamically segment and dynamically summarize Cricket videos for a generic target audience. The first neural network uses Cricket bowling activity (a visual feature) to dynamically segment Cricket videos. The segments are then forwarded to the second neural network, which identifies key segments. This key segment detection module relies on audio analysis of the Cricket video stream to identify segments that are exciting, content-representative, and informative with respect to the Cricket domain. Experimental analysis on two newly proposed benchmark datasets, the DPCS (Delivery Play Cricket Sport) image dataset and the audio-based EXINP (Excited Interval Normal Play) Cricket dataset, shows promising results. The results indicate that the proposed multi-modal approach generates an exciting, content-representative, informative, generic, and dynamic summary that incorporates domain knowledge of the sport.
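The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: both neural networks are replaced by precomputed stand-ins (a per-frame bowling flag for the visual stage, a per-segment excitement score for the audio stage), and the function names, the half-open frame ranges, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open frame range [start, end)


def segment_by_delivery(is_bowling: List[bool]) -> List[Segment]:
    """Cut the video at every frame where a bowling run begins.

    `is_bowling` stands in for the output of the visual (bowling
    activity) network; here it is just a list of per-frame flags.
    Frames before the first detected delivery are discarded.
    """
    starts = [
        i for i, flag in enumerate(is_bowling)
        if flag and (i == 0 or not is_bowling[i - 1])
    ]
    ends = starts[1:] + [len(is_bowling)]
    return list(zip(starts, ends))


def select_key_segments(
    segments: List[Segment],
    excitement: List[float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[Segment]:
    """Keep segments whose audio excitement score reaches the threshold.

    `excitement` stands in for the output of the audio network,
    one score per segment.
    """
    return [seg for seg, score in zip(segments, excitement) if score >= threshold]


# Toy input: 12 frames containing three deliveries.
flags = [False, True, True, False, False, True,
         False, False, False, True, True, False]
segments = segment_by_delivery(flags)            # [(1, 5), (5, 9), (9, 12)]
summary = select_key_segments(segments, [0.9, 0.2, 0.7])
print(summary)                                   # [(1, 5), (9, 12)]
```

In this sketch the summary length is not fixed in advance: it depends on how many segments pass the excitement test, which is one plausible reading of "dynamic" in the abstract.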
Pages: 15