A novel multi-modal neural network approach for dynamic and generic sports video summarization

Cited by: 4

Authors:

Narwal, Pulkit ^{[1]}

Duhan, Neelam ^{[1]}

Bhatia, Komal Kumar ^{[1]}

Affiliation:

[1] JC Bose Univ Sci & Technol YMCA, Comp Engn Dept, Faridabad, India

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Vol. 126

Keywords:

Video segmentation; Key segment; Cricket; Deep learning; Dynamic summary; Video summarization;

DOI:

10.1016/j.engappai.2023.106964

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Video summarization is a video compression/compaction technique that creates a shorter yet informative version of the original video. It has offered solutions to many media, user, and engineering applications. Although sports video summarization has been an active research topic for some time, there is still a gap for a multi-modal, dynamic, generic, and domain-knowledge-based approach to Cricket sport video summarization. This paper presents a multi-modal video summarization approach for Cricket sport videos that captures domain knowledge from multi-modal (audio-visual) cues. A dual neural network pipeline is proposed to dynamically segment and dynamically summarize Cricket videos for a generic target audience. The first neural network is grounded on Cricket bowling activity (a visual feature) and performs dynamic segmentation of Cricket videos. The segments are then forwarded to the second neural network, which identifies key segments. This key segment detection module relies on audio analysis of the Cricket video stream to identify segments that are exciting, content-representative, and informative with respect to the Cricket domain. Experimental analysis on two newly proposed benchmark datasets, the DPCS (Delivery Play Cricket Sport) image dataset and the audio-based EXINP (Excited Interval Normal Play) Cricket dataset, shows promising results. The results indicate that the proposed multi-modal approach generates an exciting, content-representative, informative, generic, and dynamic summary that incorporates domain knowledge of the sport.
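The two-stage flow described in the abstract (segment on bowling activity, then keep the segments whose audio is judged exciting) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-frame delivery flags stand in for the output of the visual network, the `excitement` callable stands in for the audio network, and the names `Segment`, `segment_by_delivery`, `select_key_segments`, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Segment:
    start: int  # index of the first frame (inclusive)
    end: int    # index one past the last frame (exclusive)


def segment_by_delivery(is_delivery: Sequence[bool]) -> List[Segment]:
    """Open a new segment wherever a run of delivery (bowling) frames begins.

    `is_delivery` is assumed to come from a visual classifier; here it is
    just a list of booleans. Frames before the first delivery are dropped.
    """
    starts = [
        i for i, flag in enumerate(is_delivery)
        if flag and (i == 0 or not is_delivery[i - 1])
    ]
    ends = starts[1:] + [len(is_delivery)]
    return [Segment(s, e) for s, e in zip(starts, ends)]


def select_key_segments(
    segments: Sequence[Segment],
    excitement: Callable[[Segment], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Segment]:
    """Keep segments whose (assumed) audio excitement score meets the threshold."""
    return [seg for seg in segments if excitement(seg) >= threshold]


# Toy input: deliveries start at frames 1 and 5.
flags = [False, True, True, False, False, True, False, False]
segments = segment_by_delivery(flags)

# Made-up audio scores keyed by segment start frame.
scores = {1: 0.9, 5: 0.2}
summary = select_key_segments(segments, lambda seg: scores[seg.start])
```

Because the number of selected segments depends on the content rather than on a fixed length budget, the summary length varies from video to video, which is one plausible reading of "dynamic" in the abstract.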


Pages: 15

