A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval

Cited by: 14

Authors:

Lokoc, Jakub [1]

Soucek, Tomas [1]

Vesely, Patrik [1]

Mejzlik, Frantisek [1]

Ji, Jiaqi [2]

Xu, Chaoxi [2]

Li, Xirong [2]

Affiliations:

[1] Charles Univ Prague, Fac Math & Phys, Dept Software Engn, SIRET Res Grp, Prague, Czech Republic

[2] Renmin Univ China, Sch Informat, Key Lab Data Engn & Knowledge Engn, AI & Media Comp Grp, Beijing, Peoples R China

Source:

MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA | 2020

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China

Keywords:

datasets; neural networks; ad-hoc search; known-item search; representation learning; IMAGE RETRIEVAL; SEARCH;

DOI:

10.1145/3394171.3414002

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As reported by respected evaluation campaigns focusing on both automated and interactive video search approaches, deep learning has started to dominate the video retrieval area. However, the results are still not satisfactory for many types of search tasks focusing on high recall. To report on this challenging problem, we present two orthogonal task-based performance studies centered around the state-of-the-art W2VV++ query representation learning model for video retrieval. First, an ablation study is presented to investigate which components of the model are effective in two types of benchmark tasks focusing on high recall. Second, interactive search scenarios from the Video Browser Showdown are analyzed for two winning prototype systems implementing a selected variant of the model and providing additional querying and visualization components. The analysis of collected logs demonstrates that even with the state-of-the-art text search video retrieval model, it is still advantageous to integrate users into the search process for task types where high recall is essential.

Pages: 2553-2561

Page count: 9
