Query-Oriented Micro-Video Summarization

Cited by: 0
Authors
Jia, Mengzhao [1]
Wei, Yinwei [2]
Song, Xuemeng [1]
Sun, Teng [1]
Zhang, Min [3]
Nie, Liqiang [3]
Affiliations
[1] Shandong Univ, Dept Comp Sci & Technol, Qingdao 250100, Peoples R China
[2] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia
[3] Harbin Inst Technol, Sch Comp, Shenzhen 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Video summarization; query suggestion; micro-video retrieval;
DOI
10.1109/TPAMI.2024.3355402
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The query-oriented micro-video summarization task aims to generate a concise sentence with two properties: (a) it summarizes the main semantics of the micro-video, and (b) it is expressed in the form of a search query to facilitate retrieval. Despite its considerable application value in retrieval, this direction has barely been explored. Previous summarization studies mostly focus on content summarization for traditional long videos. Applying these methods directly tends to yield unsatisfactory results because of the unique features of micro-videos and queries: diverse entities and complex scenes within a short duration, semantic gaps between modalities, and varied queries in distinct expressions. To adapt to these characteristics, we propose a query-oriented micro-video summarization model, dubbed QMS. It uses an encoder-decoder transformer architecture as its backbone. The multi-modal (visual and textual) signals are passed through two modality-specific encoders to obtain their representations, followed by an entity-aware representation learning module that identifies and highlights critical entity information. For optimization, given the large semantic gaps between modalities, we assign different confidence scores to them according to their semantic relevance. Additionally, we develop a novel strategy to sample an effective target query from the diverse query set with various expressions. Extensive experiments demonstrate the superiority of QMS over several state-of-the-art methods on both the summarization and retrieval tasks.
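The abstract does not give the formula behind the confidence scores, so the following is only an illustrative sketch of one possible reading: each modality's training loss is weighted by how semantically close that modality's embedding is to the target query. The cosine-plus-softmax weighting, the function names (`confidence_scores`, `weighted_loss`), the `temperature` parameter, and the toy vectors are all assumptions made for illustration, not the QMS implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def confidence_scores(modal_embeddings, query_embedding, temperature=1.0):
    """Hypothetical scoring: softmax over modality-to-query cosine similarities."""
    sims = [cosine(e, query_embedding) / temperature for e in modal_embeddings]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_loss(per_modality_losses, scores):
    """Combine per-modality losses using the confidence scores as weights."""
    return sum(w * l for w, l in zip(scores, per_modality_losses))


# Toy data (assumed): the visual embedding aligns with the query, the textual one does not.
visual = [1.0, 0.0]
textual = [0.0, 1.0]
query = [1.0, 0.0]

scores = confidence_scores([visual, textual], query)
loss = weighted_loss([0.5, 2.0], scores)
```

In this toy case the visual modality receives the larger weight, so its loss dominates the combined objective. A lower `temperature` would sharpen that preference.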
Pages: 4174 - 4187
Number of pages: 14