Unsupervised Video Hashing via Deep Neural Network

Cited by: 0

Authors:

Chao Ma

Yun Gu

Chen Gong

Jie Yang

Deying Feng

Affiliations:

[1] Shanghai Jiao Tong University, Institute of Image Processing and Pattern Recognition

[2] Nanjing University of Science and Technology, School of Computer Science and Engineering

[3] Liaocheng University

Source:

Neural Processing Letters | 2018 / Vol. 47

Keywords:

Video hashing; Unsupervised method; Deep neural network; Spatio-temporal feature

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Hashing is a common solution for content-based multimedia retrieval: it encodes high-dimensional feature vectors into short binary codes. Previous work has mainly focused on the image hashing problem. However, these methods cannot be directly used for video hashing, because videos contain not only spatial structure within each frame but also temporal correlation between successive frames. Several researchers have proposed to handle this by encoding extracted key frames, but these frame-based methods are time-consuming in real applications. Other researchers have proposed to characterize a video by averaging the spatial features of its frames, after which existing hashing methods can be adopted. Unfortunately, this sort of “video” feature does not take the correlation between frames into consideration and may lose temporal information. Therefore, in this paper, we propose a novel unsupervised video hashing framework via a deep neural network, which performs video hashing by incorporating the temporal structure as well as the conventional spatial structure. Specifically, the spatial features of videos are obtained with a convolutional neural network, and the temporal features are established via long short-term memory (LSTM). After that, a time series pooling strategy is employed to obtain a single feature vector for each video. The resulting spatio-temporal feature can be applied to many existing unsupervised hashing methods. Experimental results on two real datasets indicate that, by employing the spatio-temporal features, our hashing method significantly improves the performance of existing methods that use only spatial features, and also obtains higher mean average precision than state-of-the-art video hashing methods.
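The last two stages described in the abstract (pooling per-frame features into one vector per video, then binarizing that vector) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the per-frame features are assumed to be already computed and supplied as a `T x D` array (the CNN and LSTM stages are not reproduced), mean pooling stands in for the unspecified time series pooling strategy, and random-hyperplane hashing stands in for the "existing unsupervised hashing methods". The function names `pool_frames`, `make_projection`, `hash_codes`, and `hamming_distance` are hypothetical.

```python
import numpy as np


def pool_frames(frame_features):
    """Mean-pool a (T x D) array of per-frame features into one D-dim vector.

    Assumption: mean pooling is used here only as a simple placeholder for
    the paper's time series pooling strategy.
    """
    return np.asarray(frame_features, dtype=float).mean(axis=0)


def make_projection(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes in a dim-dimensional feature space."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim, n_bits))


def hash_codes(video_vectors, projection, center):
    """Binarize video vectors: one bit per hyperplane (1 if on the positive side).

    Assumption: random-hyperplane hashing is a stand-in for whichever
    unsupervised hashing method is actually applied to the features.
    """
    projected = (np.asarray(video_vectors, dtype=float) - center) @ projection
    return (projected > 0).astype(np.uint8)


def hamming_distance(code_a, code_b):
    """Number of bit positions in which two binary codes differ."""
    return int(np.count_nonzero(np.asarray(code_a) != np.asarray(code_b)))


def demo(n_videos=5, dim=16, n_bits=8, seed=1):
    """Hash a few synthetic 'videos' with different frame counts."""
    rng = np.random.default_rng(seed)
    # Synthetic per-frame features; real ones would come from a CNN + LSTM.
    videos = [rng.standard_normal((int(rng.integers(10, 30)), dim))
              for _ in range(n_videos)]
    vectors = np.stack([pool_frames(v) for v in videos])
    projection = make_projection(dim, n_bits)
    return hash_codes(vectors, projection, center=vectors.mean(axis=0))
```

Retrieval would then rank database videos by `hamming_distance` between their codes and the query's code. Pooling first means each video is hashed once, rather than once per key frame.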

Pages: 877-890

Page count: 13

50 items in total

[31] Constrained fixation point based segmentation via deep neural network [J].

Li, Gongyang ;

Liu, Zhi ;

Shi, Ran ;

Wei, Weijie .

NEUROCOMPUTING, 2019, 368 :180-187

[32] Deep Convolutional Neural Network Compression via Coupled Tensor Decomposition [J].

Sun, Weize ;

Chen, Shaowu ;

Huang, Lei ;

So, Hing Cheung ;

Xie, Min .

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2021, 15 (03) :603-616

[33] A REGRESSION APPROACH TO BINAURAL SPEECH SEGREGATION VIA DEEP NEURAL NETWORK [J].

Fan, Nana ;

Du, Jun ;

Dai, Li-Rong .

2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,

[34] LATENT SOURCE MINING IN FMRI DATA VIA DEEP NEURAL NETWORK [J].

Huang, Heng ;

Hu, Xintao ;

Han, Junwei ;

Lv, Jinglei ;

Liu, Nian ;

Guo, Lei ;

Liu, Tianming .

2016 IEEE 13TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 2016, :638-641

[35] Accuracy Measurement of Deep Neural Network Accelerator via Metamorphic Testing [J].

Wang, Chaojin ;

Shen, Jian ;

Fang, Chunrong ;

Guan, Xiangsheng ;

Wu, Kaitao ;

Wang, Jiang .

2020 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING (AITEST), 2020, :55-61

[36] Video hashing based on appearance and attention features fusion via DBN [J].

Sun, Jiande ;

Liu, Xiaocui ;

Wan, Wenbo ;

Li, Jing ;

Zhao, Dong ;

Zhang, Huaxiang .

NEUROCOMPUTING, 2016, 213 :84-94

[37] Deep Neural Network Method of Recognizing the Critical Situations for Transport Systems by Video Images [J].

Pashchenko, F. F. ;

Amosov, O. S. ;

Amosova, S. G. ;

Ivanov, Y. S. ;

Zhiganov, S. V. .

10TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT 2019) / THE 2ND INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40 2019) / AFFILIATED WORKSHOPS, 2019, 151 :675-682

[38] MPNET: An End-to-End Deep Neural Network for Object Detection in Surveillance Video [J].