VIDEOWHISPER: Toward Discriminative Unsupervised Video Feature Learning With Attention-Based Recurrent Neural Networks

Cited by: 20
Authors
Zhao, Na [1 ]
Zhang, Hanwang [1 ]
Hong, Richang [2 ]
Wang, Meng [2 ]
Chua, Tat-Seng [1 ]
Affiliations
[1] Natl Univ Singapore, Sch Comp, Singapore 117417, Singapore
[2] Hefei Univ Technol, Sch Comp & Informat, Hefei 230009, Anhui, Peoples R China
Funding
National Research Foundation, Singapore
Keywords
Recurrent neural networks; sequence learning; unsupervised feature learning; video features; RECOGNITION;
DOI
10.1109/TMM.2017.2722687
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We present VIDEOWHISPER, a novel approach to unsupervised video representation learning. Based on the observation that the frame sequence encodes the temporal dynamics of a video (e.g., object movement and event evolution), we treat the sequential order of frames as a self-supervision signal for learning video representations. Unlike other unsupervised video feature learning methods, which rely on frame-level feature reconstruction and are therefore sensitive to visual variance, VIDEOWHISPER is driven by a novel video "sequence-to-whisper" learning strategy. Specifically, for each video sequence, we use a prelearned visual dictionary to generate a sequence of high-level semantics, dubbed the "whisper," which can be regarded as a language describing the video dynamics. We then formulate VIDEOWHISPER as an end-to-end sequence-to-sequence learning model using attention-based recurrent neural networks. The model is trained to predict the whisper sequence and thereby learns the temporal structure of videos. We propose two ways to generate video representations from the model. Through extensive experiments on two real-world video datasets, we demonstrate that the video representations learned by VIDEOWHISPER effectively boost fundamental multimedia applications such as video retrieval and event classification.
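The abstract does not say how the prelearned visual dictionary turns frames into a whisper sequence. As a rough illustration only, the sketch below assumes each frame is already a feature vector and that the dictionary is a list of codewords, and it assigns every frame to its nearest codeword by Euclidean distance. The function name `frames_to_whisper`, the toy data, and the distance choice are all hypothetical and not taken from the paper.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def frames_to_whisper(frame_features, dictionary):
    """Map each frame feature vector to the index of its nearest codeword.

    Hypothetical stand-ins, not the paper's actual pipeline:
    `frame_features` is a list of per-frame feature vectors and
    `dictionary` is a list of prelearned codewords of the same dimension.
    """
    whisper = []
    for feature in frame_features:
        # Assumption: nearest codeword under Euclidean distance.
        nearest = min(
            range(len(dictionary)),
            key=lambda k: dist(feature, dictionary[k]),
        )
        whisper.append(nearest)
    return whisper


# Toy 2-D example: three codewords, four frames.
dictionary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
frames = [(0.1, 0.0), (0.9, 0.1), (0.8, 0.2), (0.1, 0.9)]
whisper = frames_to_whisper(frames, dictionary)
print(whisper)  # [0, 1, 1, 2]
```

Under these assumptions, the resulting index sequence would play the role of the target that a sequence-to-sequence model is trained to predict from the raw frame sequence.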
Pages: 2080-2092
Number of pages: 13
Related papers
50 records in total
  • [31] Automated Labeling of Bugs and Tickets Using Attention-Based Mechanisms in Recurrent Neural Networks
    Lyubinets, Volodymyr
    Nicholas, Deon
    Boiko, Taras
    2018 IEEE SECOND INTERNATIONAL CONFERENCE ON DATA STREAM MINING & PROCESSING (DSMP), 2018, : 271 - 275
  • [32] Detection of Paroxysmal Atrial Fibrillation using Attention-based Bidirectional Recurrent Neural Networks
    Shashikumar, Supreeth P.
    Shah, Amit J.
    Clifford, Gari D.
    Nemati, Shamim
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 715 - 723
  • [33] End-to-end Language Identification using Attention-based Recurrent Neural Networks
    Geng, Wang
    Wang, Wenfu
    Zhao, Yuanyuan
    Cai, Xinyuan
    Xu, Bo
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2944 - 2948
  • [34] Deep Personalized Glucose Level Forecasting Using Attention-based Recurrent Neural Networks
    Armandpour, Mohammadreza
    Kidd, Brian
    Du, Yu
    Huang, Jianhua Z.
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [35] Music Feature Classification Based on Recurrent Neural Networks with Channel Attention Mechanism
    Gan, Jie
    MOBILE INFORMATION SYSTEMS, 2021, 2021
  • [36] Joint Supervision for Discriminative Feature Learning in Convolutional Neural Networks
    Guo, Jianyuan
    Yuan, Yuhui
    Zhang, Chao
    COMPUTER VISION, PT II, 2017, 772 : 509 - 520
  • [37] An Attention-Based Spiking Neural Network for Unsupervised Spike-Sorting
    Bernert, Marie
    Yvert, Blaise
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2019, 29 (08)
  • [38] Extended Siamese Convolutional Neural Networks for Discriminative Feature Learning
    Lee, Sangyun
    Hong, Sungjun
    INTERNATIONAL JOURNAL OF FUZZY LOGIC AND INTELLIGENT SYSTEMS, 2022, 22 (04) : 339 - 349
  • [39] Using attention-based neural networks for predicting student learning outcomes in service-learning
    Eugene Yujun Fu
    Grace Ngai
    Hong Va Leong
    Stephen C.F. Chan
    Daniel T.L. Shek
    Education and Information Technologies, 2023, 28 : 13763 - 13789
  • [40] Using attention-based neural networks for predicting student learning outcomes in service-learning
    Fu, Eugene Yujun
    Ngai, Grace
    Leong, Hong Va
    Chan, Stephen C. F.
    Shek, Daniel T. L.
    EDUCATION AND INFORMATION TECHNOLOGIES, 2023, 28 (10) : 13763 - 13789