Improving Spatiotemporal Self-supervision by Deep Reinforcement Learning

Cited by: 72
Authors
Buechler, Uta [1]
Brattoli, Biagio [1]
Ommer, Bjoern [1]
Affiliations
[1] Heidelberg Univ, HCI IWR, Heidelberg, Germany
Source
COMPUTER VISION - ECCV 2018, PT 15 | 2018 / Vol. 11219
Keywords
Deep reinforcement learning; Self-supervision; Shuffling; Action recognition; Image understanding;
DOI
10.1007/978-3-030-01267-0_47
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Self-supervised learning of convolutional neural networks can harness large amounts of cheap unlabeled data to train powerful feature representations. As a surrogate task, we jointly address the ordering of visual data in the spatial and temporal domains. The permutations of training samples, which are at the core of self-supervision by ordering, have so far been sampled randomly from a fixed preselected set. Based on deep reinforcement learning, we propose a sampling policy that adapts to the state of the network being trained. New permutations are therefore sampled according to their expected utility for updating the convolutional feature representation. Experimental evaluation on unsupervised and transfer learning tasks demonstrates competitive performance on standard benchmarks for image and video classification and nearest neighbor retrieval.
Pages: 797-814
Page count: 18