Unsupervised t-Distributed Video Hashing and Its Deep Hashing Extension

Cited by: 44

Authors:

Hao, Yanbin [1]

Mu, Tingting [2]

Goulermas, John Y. [3]

Jiang, Jianguo [1]

Hong, Richang [1]

Wang, Meng [1]

Affiliations:

[1] Hefei Univ Technol, Sch Comp & Informat, Hefei 230009, Anhui, Peoples R China

[2] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England

[3] Univ Liverpool, Dept Comp Sci, Liverpool L69 3BX, Merseyside, England

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2017 / Vol. 26 / No. 11

Keywords:

Video retrieval; hashing; deep neural network; multi-view learning; unsupervised learning; Student t-distribution; MULTI-FEATURE FUSION; RETRIEVAL; ROBUST; REPRESENTATION; LOCALIZATION;

DOI:

10.1109/TIP.2017.2737329

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, a novel unsupervised hashing algorithm, referred to as t-USMVH, and its extension to unsupervised deep hashing, referred to as t-UDH, are proposed to support large-scale video-to-video retrieval. To improve the robustness of unsupervised learning, t-USMVH combines multiple types of feature representations and fuses them by examining two scores: a continuous relevance score based on a Gaussian estimation over pairwise distances, and a discrete neighbor score based on the cardinality of reciprocal neighbors. To reduce sensitivity to scale changes when mapping objects that are far apart from each other, a Student t-distribution is used to estimate the similarity between the relaxed hash code vectors for keyframes. This results in more accurate preservation of the desired unsupervised similarity structure in the hash code space. By adapting the corresponding optimization objective and constructing the hash mapping function via a deep neural network, we develop a robust unsupervised training strategy for a deep hashing network. The efficiency and effectiveness of the proposed methods are evaluated on two public video collections via comparisons against multiple classical and state-of-the-art methods.
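The three ingredients named in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no equations, so the Gaussian kernel form, the bandwidth `sigma`, the reciprocal k-nearest-neighbor definition, and the t-SNE-style Student t kernel with one degree of freedom are all assumptions, and every function name below is hypothetical.

```python
import math

def pairwise_sq_dists(points):
    """Squared Euclidean distance between every pair of points."""
    n = len(points)
    return [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]

def gaussian_relevance(sq_dists, sigma=1.0):
    """Continuous score: Gaussian kernel over pairwise distances (assumed form)."""
    return [[math.exp(-d / (2.0 * sigma ** 2)) for d in row] for row in sq_dists]

def knn_sets(sq_dists, k):
    """Indices of the k nearest neighbors of each point, excluding itself."""
    n = len(sq_dists)
    return [set(sorted((j for j in range(n) if j != i),
                       key=lambda j: sq_dists[i][j])[:k])
            for i in range(n)]

def reciprocal_neighbors(sq_dists, k):
    """Assumed reading: j is a reciprocal neighbor of i when each is in the
    other's k-nearest-neighbor set. A discrete score is then a set size."""
    knn = knn_sets(sq_dists, k)
    return [{j for j in knn[i] if i in knn[j]} for i in range(len(knn))]

def student_t_similarity(codes):
    """Heavy-tailed similarity between relaxed code vectors, normalized over
    all pairs i != j (t-SNE-style kernel, assumed rather than sourced)."""
    sq = pairwise_sq_dists(codes)
    n = len(codes)
    kernel = [[0.0 if i == j else 1.0 / (1.0 + sq[i][j]) for j in range(n)]
              for i in range(n)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

if __name__ == "__main__":
    # Two tight pairs of toy "keyframe" vectors, far apart from each other.
    pts = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    d = pairwise_sq_dists(pts)
    print("reciprocal neighbor set sizes:",
          [len(s) for s in reciprocal_neighbors(d, k=1)])
    print("Gaussian score, far pair:", gaussian_relevance(d)[0][2])
    print("Student t kernel, far pair:", 1.0 / (1.0 + d[0][2]))
```

For the distant pair, the Student t kernel decays polynomially while the Gaussian decays exponentially, so the t kernel stays many orders of magnitude larger. That heavier tail is one plausible reading of the abstract's claim about reduced sensitivity to scale for far-apart objects.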

Citation

Pages: 5531-5544

Page count: 14

71 references in total
