Shot Boundary Detection with Augmented Annotations

Citations: 0
Authors
Esteve Brotons, Miguel Jose [1]
Carmona Blanco, Jorge [1]
Javier Lucendo, Francisco [1]
Garcia-Rodriguez, Jose [2]
Affiliations
[1] Telefon I D, Madrid, Spain
[2] Univ Alicante, Comp Technol Dept, Alicante, Spain
Source
ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2023, PT I | 2023 / Vol. 14134
Keywords
Shot boundary detection; augmented annotations; visual inspection
DOI
10.1007/978-3-031-43085-5_19
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, deep learning approaches have been considered to provide state-of-the-art results in shot boundary detection. These approaches depend on large annotated datasets, and the quality of the annotations is crucial to the robustness of the algorithm. Graphical tools for verifying both the annotation of the original datasets and the generation of synthetic datasets are therefore a must. In this paper we propose a framework that allows visual inspection of the datasets and incorporates the option of editing the annotations manually, as well as adding annotations from other algorithms, to generate a set of augmented annotations. In addition, we benchmark the performance of TransNet in three scenarios: 1) using the datasets with their original annotations, 2) using automatically generated annotations, and 3) using the combination of the previous annotations as augmented annotations. We conclude that the use of augmented annotations significantly improves the network results.
Pages: 234-250
Page count: 17