Integrating Spatial and Temporal Contextual Information for Improved Video Visualization

Authors
Singh, Pratibha [1]
Kushwaha, Alok Kumar Singh [1]
Affiliations
[1] Guru Ghasidas Vishwavidyalaya, Dept Comp Sci & Engn, Bilaspur, India
Source
FOURTH CONGRESS ON INTELLIGENT SYSTEMS, VOL 2, CIS 2023 | 2024 / Vol. 869
Keywords
Video visualization; Moment retrieval; Highlights detection; Self-attention network
DOI
10.1007/978-981-99-9040-5_30
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video representation learning is crucial for various tasks, and self-attention has emerged as an effective technique for capturing long-range dependencies. However, existing methods often compute pairwise correlations along the spatial and temporal dimensions simultaneously, which neglects the distinct contextual information that each kind of correlation conveys. To address this limitation, we propose a novel module that models spatial and temporal correlations sequentially, which enables the efficient integration of spatial contexts into temporal modeling. By incorporating this module into a 2D CNN, we develop a self-attention module network tailored for video visualization. We evaluate our approach on two benchmark datasets, Charades-STA and QVHighlights, which target the moment retrieval and highlight detection tasks. Extensive experiments show that the self-attention module network outperforms existing methods on both datasets. Notably, our models consistently surpass shallower networks and those with fewer modalities. In summary, the proposed self-attention module advances video representation learning by effectively capturing spatial and temporal correlations, and the improvements achieved in moment retrieval and highlight detection support the efficacy and versatility of the approach.
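The abstract does not give the module's equations, so the following is only a minimal NumPy sketch of the general idea it describes: self-attention applied first across spatial positions within each frame, then across frames at each position. Everything here is an assumption for illustration and not the authors' implementation: the function names, the (T, N, C) feature layout, the residual connections, and the use of identity query/key/value projections instead of learned ones.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Scaled dot-product self-attention over the second-to-last axis.
    # x has shape (..., L, C). Identity projections are an assumption
    # made to keep the sketch short; a real module would learn them.
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])  # (..., L, L)
    return softmax(scores, axis=-1) @ x                          # (..., L, C)

def spatial_then_temporal(x):
    # x: (T, N, C) = frames, spatial positions per frame, channels.
    # Step 1: spatial attention among the N positions inside each frame.
    s = x + self_attention(x)                 # (T, N, C)
    # Step 2: temporal attention among the T frames at each position,
    # operating on features that already carry spatial context.
    t = np.swapaxes(s, 0, 1)                  # (N, T, C)
    t = t + self_attention(t)
    return np.swapaxes(t, 0, 1)               # back to (T, N, C)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((4, 6, 8))    # hypothetical 4 frames, 6 positions, 8 channels
    out = spatial_then_temporal(feats)
    print(out.shape)
```

In this factorized form the attention score matrices hold T·N² + N·T² entries, whereas attending jointly over all T·N tokens would need (T·N)² entries, which is one way to read the abstract's claim of efficient integration.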
Pages: 415-424
Page count: 10