LSTM-based multi-label video event detection

Cited by: 27
Authors
Liu, An-An [1]
Shao, Zhuang [1]
Wong, Yongkang [2]
Li, Junnan [3]
Su, Yu-Ting [1]
Kankanhalli, Mohan [4]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Natl Univ Singapore, Smart Syst Inst, Singapore, Singapore
[3] Natl Univ Singapore, NUS Grad Sch Integrat Sci & Engn, Singapore, Singapore
[4] Natl Univ Singapore, Sch Comp, Singapore, Singapore
Funding
National Research Foundation of Singapore; National Natural Science Foundation of China
Keywords
Concurrent event detections; Recurrent neural network; HISTOGRAMS; RECOGNITION; FLOW;
DOI
10.1007/s11042-017-5532-x
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Large-scale surveillance videos often contain complex visual events, so generating video descriptions effectively and efficiently without human supervision has become essential. To address this problem, we propose a novel architecture, motivated by the sequence-to-sequence network, for jointly recognizing multiple events in a given surveillance video. The proposed architecture predicts what happens in a video directly, without object detection and tracking as preprocessing steps. We evaluate several variants of the proposed architecture with different visual features on a novel dataset prepared by our group, and we compute a wide range of quantitative metrics to evaluate it. We further compare it with the popular Support Vector Machine-based visual event detection method. The comparison results suggest that the proposed method can outperform traditional computer vision pipelines for visual event detection.
Pages: 677-695
Page count: 19