Fast coarse-to-fine video retrieval via shot-level statistics

Authors
Ho, YH [1]
Lin, CW [1]
Chen, JF [1]
Liao, HYM [1]
Affiliation
[1] Natl Chung Cheng Univ, Dept Comp Sci & Informat Engn, Chiayi 621, Taiwan
Keywords
video retrieval; query by clip; video matching; coarse-to-fine search; video database
DOI
10.1117/12.631379
Chinese Library Classification
TB8 [Photographic technology]
Discipline classification code
0804
Abstract
We propose a fast coarse-to-fine video retrieval scheme using shot-level spatio-temporal statistics. The scheme consists of a two-pass coarse search followed by a fine search. At the coarse-search stage, shot-level motion and color distributions are computed as the spatio-temporal features for shot matching. The first-pass coarse search uses these shot-level global statistics to drastically reduce the search space. The second-pass coarse search adds a shot adjacent to the first query shot, introducing the "causality" relation between two consecutive shots to improve search accuracy. Finally, the fine-search step refines the result using local color features of the key-frames of the query shot. Experimental results show that the proposed method achieves good retrieval performance at much lower complexity than single-pass methods.
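The three-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Shot` record, the L1 histogram distance, the thresholds `t1` and `t2`, the `top_k` cutoff, and the choice of the immediately following shot as the "adjacent" shot are all hypothetical and do not come from the record itself.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """Hypothetical per-shot record; all feature layouts are assumptions."""
    video_id: str
    index: int            # position of the shot within its video
    motion_hist: list     # shot-level motion distribution (normalized)
    color_hist: list      # shot-level color distribution (normalized)
    keyframe_hist: list   # local color feature of a key-frame (normalized)


def l1(p, q):
    """L1 distance between two equal-length histograms (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def shot_distance(a, b):
    """Coarse distance: motion plus color distribution mismatch."""
    return l1(a.motion_hist, b.motion_hist) + l1(a.color_hist, b.color_hist)


def coarse_to_fine(query, database, t1=0.5, t2=0.5, top_k=5):
    """query: two or more consecutive Shot objects from the query clip.
    database: dict mapping (video_id, index) -> Shot.
    Returns candidate start shots ranked by the fine (key-frame) distance.
    """
    q0, q1 = query[0], query[1]

    # First-pass coarse search: prune with shot-level global statistics.
    candidates = [s for s in database.values() if shot_distance(q0, s) <= t1]

    # Second-pass coarse search: the shot that follows each candidate must
    # also match the query's adjacent shot (consecutive-shot relation).
    survivors = []
    for s in candidates:
        nxt = database.get((s.video_id, s.index + 1))
        if nxt is not None and shot_distance(q1, nxt) <= t2:
            survivors.append(s)

    # Fine search: rank the few survivors by key-frame color similarity.
    survivors.sort(key=lambda s: l1(q0.keyframe_hist, s.keyframe_hist))
    return survivors[:top_k]
```

The point of the ordering is that the cheap shot-level comparisons run over the whole database, while the key-frame comparison runs only on the candidates that survive both coarse passes.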
Pages: 239-250
Number of pages: 12