Visual Reinforcement Learning With Self-Supervised 3D Representations

Cited by: 10
Authors
Ze, Yanjie [1, 2]
Hansen, Nicklas [2]
Chen, Yinbo [2]
Jain, Mohit [2]
Wang, Xiaolong [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai 200240, Peoples R China
[2] Univ Calif San Diego, San Diego, CA 92093 USA
Keywords
Three-dimensional displays; Task analysis; Visualization; Cameras; Representation learning; Training; Robot vision systems; Reinforcement learning; representation learning; deep learning for visual perception
DOI
10.1109/LRA.2023.3259681
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
A prominent approach to visual Reinforcement Learning (RL) is to learn an internal state representation using self-supervised methods, which has the potential benefit of improved sample efficiency and generalization through additional learning signal and inductive biases. However, while the real world is inherently 3D, prior efforts have largely been focused on leveraging 2D computer vision techniques as auxiliary self-supervision. In this work, we present a unified framework for self-supervised learning of 3D representations for motor control. Our proposed framework consists of two phases: a pretraining phase where a deep voxel-based 3D autoencoder is pretrained on a large object-centric dataset, and a finetuning phase where the representation is jointly finetuned together with RL on in-domain data. We empirically show that our method enjoys improved sample efficiency compared to 2D representation learning methods. Additionally, our learned policies transfer zero-shot to a real robot setup with only approximate geometric correspondence, and successfully solve motor control tasks that involve grasping and lifting from a single, uncalibrated RGB camera.
Pages: 2890-2897
Page count: 8
Related papers
50 in total
  • [1] Imbalance-Aware Self-supervised Learning for 3D Radiomic Representations
    Li, Hongwei
    Xue, Fei-Fei
    Chaitanya, Krishna
    Luo, Shengda
    Ezhov, Ivan
    Wiestler, Benedikt
    Zhang, Jianguo
    Menze, Bjoern
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT II, 2021, 12902 : 36 - 46
  • [2] Learning Action Representations for Self-supervised Visual Exploration
    Oh, Changjae
    Cavallaro, Andrea
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 5873 - 5879
  • [3] Self-Supervised Representations for Multi-View Reinforcement Learning
    Yang, Huanhuan
    Shi, Dianxi
    Xie, Guojun
    Peng, Yingxuan
    Zhang, Yi
    Yang, Yantai
    Yang, Shaowu
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, VOL 180, 2022, 180 : 2203 - 2213
  • [4] Self-Supervised Visual Representations Learning by Contrastive Mask Prediction
    Zhao, Yucheng
    Wang, Guangting
    Luo, Chong
    Zeng, Wenjun
    Zha, Zheng-Jun
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 10140 - 10149
  • [5] Towards Efficient and Effective Self-supervised Learning of Visual Representations
    Addepalli, Sravanti
    Bhogale, Kaushal
    Dey, Priyam
    Babu, R. Venkatesh
    COMPUTER VISION, ECCV 2022, PT XXXI, 2022, 13691 : 523 - 538
  • [6] ROLL: Visual Self-Supervised Reinforcement Learning with Object Reasoning
    Wang, Yufei
    Narasimhan, Gautham Narayan
    Lin, Xingyu
    Okorn, Brian
    Held, David
    CONFERENCE ON ROBOT LEARNING, VOL 155, 2020, 155 : 1030 - 1048
  • [7] 3D Human Pose Machines with Self-Supervised Learning
    Wang, Keze
    Lin, Liang
    Jiang, Chenhan
    Qian, Chen
    Wei, Pengxu
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (05) : 1069 - 1082
  • [8] Self-Supervised Learning of Detailed 3D Face Reconstruction
    Chen, Yajing
    Wu, Fanzi
    Wang, Zeyu
    Song, Yibing
    Ling, Yonggen
    Bao, Linchao
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 8696 - 8705
  • [9] Self-Supervised Online Learning of Appearance for 3D Tracking
    Lee, Bhoram
    Lee, Daniel D.
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 4930 - 4937
  • [10] Self-Supervised Deep Learning for 3D Gravity Inversion
    Li, Yinshuo
    Jia, Zhuo
    Lu, Wenkai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60