A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval

Cited by: 14
Authors
Lokoc, Jakub [1 ]
Soucek, Tomas [1 ]
Vesely, Patrik [1 ]
Mejzlik, Frantisek [1 ]
Ji, Jiaqi [2 ]
Xu, Chaoxi [2 ]
Li, Xirong [2 ]
Affiliations
[1] Charles Univ Prague, Fac Math & Phys, Dept Software Engn, SIRET Res Grp, Prague, Czech Republic
[2] Renmin Univ China, Sch Informat, Key Lab Data Engn & Knowledge Engn, AI & Media Comp Grp, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
datasets; neural networks; ad-hoc search; known-item search; representation learning; IMAGE RETRIEVAL; SEARCH;
DOI
10.1145/3394171.3414002
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As reported by respected evaluation campaigns focusing on both automated and interactive video search approaches, deep learning has started to dominate the video retrieval area. However, the results are still not satisfactory for many types of search tasks focusing on high recall. To report on this challenging problem, we present two orthogonal task-based performance studies centered around the state-of-the-art W2VV++ query representation learning model for video retrieval. First, an ablation study is presented to investigate which components of the model are effective in two types of benchmark tasks focusing on high recall. Second, interactive search scenarios from the Video Browser Showdown are analyzed for two winning prototype systems implementing a selected variant of the model and providing additional querying and visualization components. The analysis of collected logs demonstrates that even with the state-of-the-art text search video retrieval model, it is still beneficial to integrate users into the search process for task types where high recall is essential.
Pages: 2553 - 2561
Page count: 9
Related papers
50 in total
  • [21] Text readability within video retrieval applications: A study on CCTV analysis
    Newbold N.
    Gillam L.
    Journal of Multimedia, 2010, 5 (02): 123 - 141
  • [22] Interactive Search vs. Automatic Search: An Extensive Study on Video Retrieval
    Phuong-Anh Nguyen
    Chong-Wah Ngo
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (02)
  • [23] CASE-STUDY - INTERACTIVE VIDEO AND LANGUAGE-LEARNING
    BANGS, P
    EDUCATIONAL & TRAINING TECHNOLOGY INTERNATIONAL, 1990, 27 (02): 146 - 154
  • [24] CLIP2TF: Multimodal video-text retrieval for adolescent education
    Sun, Xiaoning
    Fan, Tao
    Li, Hongxu
    Wang, Guozhong
    Ge, Peien
    Shang, Xiwu
    DISPLAYS, 2024, 84
  • [25] Automated Depth Video Monitoring For Fall Reduction: A Case Study
    Kramer, Josh Brown
    Sabalka, Lucas
    Rush, Ben
    Jones, Katherine
    Nolte, Tegan
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020: 1188 - 1196
  • [26] Interactive Web Documentaries: A Case Study of Video Viewing Behaviour on iOtok
    Ducasse, Julie
    Kljun, Matjaz
    Attygalle, Nuwan T.
    Pucihar, Klen Copic
    INTERNATIONAL JOURNAL OF HUMAN-COMPUTER INTERACTION, 2022, 38 (10): 949 - 972
  • [27] TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval
    Liu, Yuqi
    Xiong, Pengfei
    Xu, Luhui
    Cao, Shengming
    Jin, Qin
    COMPUTER VISION - ECCV 2022, PT XIV, 2022, 13674: 319 - 335
  • [28] Spatial-Temporal Graphs for Cross-Modal Text2Video Retrieval
    Song, Xue
    Chen, Jingjing
    Wu, Zuxuan
    Jiang, Yu-Gang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24: 2914 - 2923
  • [29] Textbooks for the YouTube generation? A case study on the shift from text to video
    Granitz, Neil
    Kohli, Chiranjeev
    Lancellotti, Matthew P.
    JOURNAL OF EDUCATION FOR BUSINESS, 2021, 96 (05): 299 - 307
  • [30] ADAPTATION OF EDUCATIONAL TEXT TO AN OPEN INTERACTIVE LEARNING SYSTEM: A CASE STUDY FOR RETUDIS
    Samarakou, M.
    Fylladitakis, E. D.
    Tsaganou, G.
    Gelegenis, J.
    Karolidis, D.
    Prentakis, P.
    Papadakis, A.
    PROCEEDINGS OF THE IADIS INTERNATIONAL CONFERENCE E-LEARNING 2013, 2013: 296 - 302