ROBUST AND COMPACT VIDEO DESCRIPTOR LEARNED BY DEEP NEURAL NETWORK

Cited by: 0
Authors
Li, Yue Nan [1]
Chen, Xue Piao [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
Source
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2017
Funding
National Natural Science Foundation of China;
Keywords
Video content identification; Video fingerprinting; Video hashing; Deep neural network;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we propose to extract a robust video descriptor by training a deep neural network to automatically capture the intrinsic visual characteristics of digital video. More specifically, we first train a conditional generative model to capture the spatio-temporal correlations among visual contents and represent them as an intermediate descriptor. A non-linear encoder, which performs both dimension reduction and error correction, is then trained to learn a compressed yet more robust representation of the intermediate descriptor. The cascade of the conditional generative model and the encoder constitutes the building block of the deep network for learning the video descriptor. As a post-processing component, the top layers of the network are trained to optimize the robustness and discriminative capability of the output descriptor. Experimental results on benchmark databases confirm that the descriptor learned by the deep neural network shows excellent robustness against photometric, geometric, temporal and combined distortions, and that it attains an F-1 score of 0.982 in content identification, which is much higher than that of hand-engineered descriptors.
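For readers unfamiliar with the F-1 score quoted in the abstract, the following is a minimal sketch of how such a score could be computed for content identification. It is not the authors' evaluation code: the Euclidean distance, the threshold value, the function names `is_match` and `f1_score`, and the toy descriptors are all illustrative assumptions, since the abstract does not state the matching rule.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def is_match(desc_a, desc_b, threshold):
    """Declare two descriptors the same content if they are close enough.

    Euclidean distance and the threshold are illustrative assumptions,
    not the matching rule used in the paper.
    """
    return dist(desc_a, desc_b) <= threshold


def f1_score(predicted, actual):
    """F1 score: harmonic mean of precision and recall over match decisions."""
    tp = sum(p and a for p, a in zip(predicted, actual))      # correct matches
    fp = sum(p and not a for p, a in zip(predicted, actual))  # false alarms
    fn = sum(a and not p for p, a in zip(predicted, actual))  # missed matches
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up): descriptor pairs and whether they truly share content.
pairs = [
    ((0.0, 0.0), (0.1, 0.0), True),
    ((1.0, 1.0), (1.0, 1.2), True),
    ((0.0, 0.0), (2.0, 2.0), False),
    ((0.5, 0.5), (0.6, 0.5), False),  # near-collision, becomes a false positive
]
predicted = [is_match(a, b, threshold=0.3) for a, b, _ in pairs]
actual = [label for _, _, label in pairs]
score = f1_score(predicted, actual)
```

In this toy case there are two true positives, one false positive and no misses, so precision is 2/3, recall is 1, and the F1 score is about 0.8. A more robust and discriminative descriptor would reduce both false positives and misses, pushing the score toward 1.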
Pages: 2162-2166
Number of pages: 5
Related papers
15 in total
  • [1] Spatio-temporal transform based video hashing
    Coskun, Baris
    Sankur, Bulent
    Memon, Nasir
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2006, 8 (06) : 1190 - 1208
  • [2] Robust video hashing based on radial projections of key frames
    De Roover, C
    De Vleeschouwer, C
    Lefèbvre, F
    Macq, B
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2005, 53 (10) : 4020 - 4037
  • [3] A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting
    Esmaeili, Mani Malek
    Fatourechi, Mehrdad
    Ward, Rabab Kreidieh
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2011, 6 (01) : 213 - 226
  • [4] Robust video fingerprinting for content-based video identification
    Lee, Sunil
    Yoo, Chang D.
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2008, 18 (07) : 983 - 988
  • [5] Video Sequence Matching Based on the Invariance of Color Correlation
    Lei, Yanqiang
    Luo, Weiqi
    Wang, Yuangen
    Huang, Jiwu
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2012, 22 (09) : 1332 - 1343
  • [6] Twofold Video Hashing With Automatic Synchronization
    Li, Mu
    Monga, Vishal
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2015, 10 (08) : 1727 - 1738
  • [7] Compact Video Fingerprinting via Structural Graphical Models
    Li, Mu
    Monga, Vishal
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2013, 8 (11) : 1709 - 1721
  • [8] Robust Video Hashing via Multilinear Subspace Projections
    Li, Mu
    Monga, Vishal
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2012, 21 (10) : 4397 - 4409
  • [9] A facile soft-template synthesis of mesoporous polymeric and carbonaceous nanospheres
    Liu, Jian
    Yang, Tianyu
    Wang, Da-Wei
    Lu, Gao Qing
    Zhao, Dongyuan
    Qiao, Shi Zhang
    [J]. NATURE COMMUNICATIONS, 2013, 4
  • [10] Lu J., 2009, P SPIE MED FOR SEC F, V7254, P1