Implicitly using Human Skeleton in Self-supervised Learning: Influence on Spatio-temporal Puzzle Solving and on Video Action Recognition

Cited by: 0
Authors
Riand, Mathieu [1, 2]
Dolle, Laurent [1]
Le Callet, Patrick [2]
Affiliations
[1] CEA Tech Pays Loire, F-44340 Bouguenais, France
[2] Nantes Univ, Lab Sci Numer Nantes, Equipe Image Percept & Interact, Nantes, France
Source
PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON ROBOTICS, COMPUTER VISION AND INTELLIGENT SYSTEMS (ROBOVIS) | 2021
Keywords
Self-supervised Learning; Siamese Network; Skeleton Keypoints; Action Recognition; Few-shot Learning
DOI
10.5220/0010689500003061
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study the influence of overlaying skeleton data on human action videos when performing self-supervised learning and action recognition. We show that adding this information without additional constraints actually hurts the accuracy of the network; we argue that the network does not exploit the added skeleton and instead treats it as noise masking part of the natural image. We present first results on puzzle solving and video action recognition to support this hypothesis.
Pages: 128-135
Number of pages: 8