Video Fingerprinting via Deep Metric Learning

Cited by: 0
Authors
Li X. [1]
Xu L. [1]
Yang Y. [1]
Fei S. [2]
Affiliations
[1] School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo
[2] School of Automation, Southeast University, Nanjing
Source
Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics | 2020, Vol. 32, No. 9
Keywords
Deep metric learning; Double loss function; End-to-end; Multi-layer feature fusion
DOI
10.3724/SP.J.1089.2020.18102
Abstract
To make video fingerprints more compact while preserving their robustness and distinctness, an end-to-end video fingerprinting method based on deep metric learning is proposed. The framework consists of weight-sharing triplet networks. An improved 3D residual network serves as the main branch; it fuses multi-layer features and compresses them, mapping the raw data directly to compact fingerprints. The objective function combines a newly designed boundary-constrained triplet angle metric loss with a classification loss. The new triplet loss addresses the weak expression of feature correlation, and the classification loss compensates for the metric loss's insensitivity to the overall distribution of sample features. Extensive experiments on the FCVID dataset compare the proposed algorithm with traditional methods and deep learning methods. The results show that the algorithm significantly improves compactness while also improving robustness and distinctness. © 2020, Beijing China Science Journal Publishing Co. Ltd. All rights reserved.
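As an illustration only, the two-term objective described in the abstract might be sketched as below. The abstract does not give the form of the boundary-constrained triplet angle metric loss, so this sketch substitutes a generic cosine-distance triplet loss with a margin and adds a softmax cross-entropy term. The function names, the margin of 0.2, and the weight `lam` are all assumptions made for the example, not values from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_angular_loss(anchor, positive, negative, margin=0.2):
    """Generic margin-based triplet loss on cosine distance.

    Stand-in for the paper's boundary-constrained loss (an assumption):
    it is zero once the positive is closer to the anchor than the
    negative by at least `margin`.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(anchor, positive, negative, logits, label, margin=0.2, lam=1.0):
    """Metric loss plus weighted classification loss (weight `lam` is hypothetical)."""
    metric = triplet_angular_loss(anchor, positive, negative, margin)
    return metric + lam * cross_entropy(logits, label)
```

In this sketch the metric term only looks at one triplet at a time, while the cross-entropy term depends on class scores for the whole label set, which is one way to read the abstract's remark that the classification loss compensates for the metric loss's insensitivity to the overall feature distribution.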
Pages: 1411-1419
Number of pages: 8