Interactive Video Retrieval in the Age of Deep Learning - Detailed Evaluation of VBS 2019

Cited by: 31
Authors
Rossetto, Luca [1]
Gasser, Ralph [2]
Lokoc, Jakub [3]
Bailer, Werner [4]
Schoeffmann, Klaus [5]
Muenzer, Bernd [5]
Soucek, Tomas [3]
Nguyen, Phuong Anh [6]
Bolettieri, Paolo [7]
Leibetseder, Andreas [5]
Vrochidis, Stefanos [8]
Affiliations
[1] Univ Zurich, CH-8006 Zurich, Switzerland
[2] Univ Basel, CH-4001 Basel, Switzerland
[3] Charles Univ Prague, Prague 11000, Czech Republic
[4] JOANNEUM RES, DIGITAL, A-8010 Graz, Austria
[5] Alpen Adria Univ Klagenfurt, A-9020 Klagenfurt, Austria
[6] City Univ Hong Kong, Hong Kong 999077, Peoples R China
[7] CNR, I-56124 Pisa, Italy
[8] Ctr Res & Technol Hellas, Informat Technol Inst, Thessaloniki 57001, Greece
Funding
European Union Horizon 2020;
Keywords
Task analysis; Visualization; Browsers; Annotations; Deep learning; Semantics; Tools; Interactive video retrieval; video browsing; video content analysis; content-based retrieval; evaluations; IMAGE;
DOI
10.1109/TMM.2020.2980944
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Although automatic content analysis has made remarkable progress over the last decade, mainly due to significant advances in machine learning, interactive video retrieval is still a very challenging problem with increasing relevance in practical applications. The Video Browser Showdown (VBS) is an annual evaluation competition that pushes the limits of interactive video retrieval with state-of-the-art tools, tasks, data, and evaluation metrics. In this paper, we analyse the results and outcome of the 8th iteration of the VBS in detail. We first give an overview of the novel and considerably larger V3C1 dataset and the tasks that were performed during VBS 2019. We then describe the search systems of the six international teams in terms of features and performance. Finally, we perform an in-depth analysis of the per-team success ratio and relate it to the search strategies that were applied, the most popular features, and the problems that were experienced. A large part of this analysis is based on logs collected during the competition itself. It gives further insights into typical search behavior and the differences between expert and novice users. Our evaluation shows that textual search and content browsing are the most important aspects in terms of logged user interactions. Furthermore, we observe a trend towards deep learning based features, especially in the form of labels generated by artificial neural networks. Nevertheless, for some tasks, very specific content-based search features are still being used. We expect these findings to contribute to future improvements of interactive video search systems.
Pages: 243-256
Page count: 14