ROBUST AND COMPACT VIDEO DESCRIPTOR LEARNED BY DEEP NEURAL NETWORK

Cited by: 0

Authors:

Li, Yue Nan [1]

Chen, Xue Piao [1]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

Video content identification; Video fingerprinting; Video hashing; Deep neural network;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose to extract a robust video descriptor by training a deep neural network to automatically capture the intrinsic visual characteristics of digital video. More specifically, we first train a conditional generative model to capture the spatio-temporal correlations among visual contents and represent them as an intermediate descriptor. A non-linear encoder, which performs both dimension reduction and error correction, is then trained to learn a compressed yet more robust representation of the intermediate descriptor. The cascade of the conditional generative model and the encoder constitutes the building block of the deep network for learning the video descriptor. As a post-processing component, the top layers of the network are trained to optimize the robustness and discriminative capability of the output descriptor. Experimental results on benchmark databases confirm that the descriptor learned by the deep neural network shows excellent robustness against photometric, geometric, temporal and combined distortions, and that it attains an F-1 score of 0.982 in content identification, which is much higher than that of hand-engineered descriptors.
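For readers unfamiliar with the metric quoted in the abstract, the F-1 score is the harmonic mean of precision and recall. The sketch below computes it from match counts; the function name `f1_score` and the example counts are hypothetical and are not taken from the paper, whose evaluation protocol is not described in this record.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F-1 score: the harmonic mean of precision and recall.

    tp: correct identifications (true positives)
    fp: wrong identifications (false positives)
    fn: missed identifications (false negatives)
    """
    if tp == 0:
        # No correct identifications: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only; not results from the paper.
example = f1_score(tp=90, fp=5, fn=10)
print(round(example, 3))
```

A score close to 1 requires both few false matches and few missed matches, which is why the metric is commonly reported for content identification tasks.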

Citation

Pages: 2162-2166

Page count: 5

References: 15 in total

[1] Spatio-temporal transform based video hashing
Coskun, Baris
Sankur, Bulent
Memon, Nasir
[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2006, 8 (06) : 1190 - 1208
[2] Robust video hashing based on radial projections of key frames
De Roover, C
De Vleeschouwer, C
Lefèbvre, F
Macq, B
[J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2005, 53 (10) : 4020 - 4037
[3] A Robust and Fast Video Copy Detection System Using Content-Based Fingerprinting
Esmaeili, Mani Malek
Fatourechi, Mehrdad
Ward, Rabab Kreidieh
[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2011, 6 (01) : 213 - 226
[4] Robust video fingerprinting for content-based video identification
Lee, Sunil
Yoo, Chang D.
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2008, 18 (07) : 983 - 988
[5] Video Sequence Matching Based on the Invariance of Color Correlation
Lei, Yanqiang
Luo, Weiqi
Wang, Yuangen
Huang, Jiwu
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2012, 22 (09) : 1332 - 1343
[6] Twofold Video Hashing With Automatic Synchronization
Li, Mu
Monga, Vishal
[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2015, 10 (08) : 1727 - 1738
[7] Compact Video Fingerprinting via Structural Graphical Models
Li, Mu
Monga, Vishal
[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2013, 8 (11) : 1709 - 1721
[8] Robust Video Hashing via Multilinear Subspace Projections
Li, Mu
Monga, Vishal
[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2012, 21 (10) : 4397 - 4409
[9] A facile soft-template synthesis of mesoporous polymeric and carbonaceous nanospheres
Liu, Jian
Yang, Tianyu
Wang, Da-Wei
Lu, Gao Qing
Zhao, Dongyuan
Qiao, Shi Zhang
[J]. NATURE COMMUNICATIONS, 2013, 4
[10] Lu J., 2009, P SPIE MED FOR SEC F, V7254, P1
