Visual Affordance Prediction for Guiding Robot Exploration

Cited by: 2
Authors
Bharadhwaj, Homanga [1]
Gupta, Abhinav [1]
Tulsiani, Shubham [1]
Affiliation
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Source
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA | 2023
DOI
10.1109/ICRA48891.2023.10161288
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning 'visual affordances'. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. To allow predicting diverse plausible futures, we discretize the space of continuous images with a VQ-VAE and use a Transformer-based model to learn a conditional distribution in the latent embedding space. We show that these models can be trained using large-scale and diverse passive data, and that the learned models exhibit compositional generalization to diverse objects beyond the training distribution. We evaluate the quality and diversity of the generations, and demonstrate how the trained affordance model can be used for guiding exploration during visual goal-conditioned policy learning in robotic manipulation.
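To make the abstract's pipeline concrete, the sketch below illustrates the general two-stage recipe it describes: a VQ-VAE tokenizes images into discrete latent codes, and an autoregressive Transformer models a conditional distribution over future-state codes given the current scene's codes, so that sampling yields diverse plausible futures. This is a minimal PyTorch sketch, not the authors' implementation; all module names (ToyVQVAE, AffordancePrior), sizes, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (NOT the paper's code): a tiny VQ-VAE image
# tokenizer plus an autoregressive Transformer prior over latent codes,
# mirroring the two-stage recipe the abstract describes. All sizes and
# names below are assumptions chosen for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVQVAE(nn.Module):
    """Conv encoder -> nearest-codebook quantization (decoder omitted)."""
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=4, stride=4),   # 64x64 -> 16x16 grid
            nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
        )
        self.codebook = nn.Embedding(num_codes, dim)

    def encode_to_tokens(self, img):                      # img: (B, 3, 64, 64)
        z = self.enc(img)                                 # (B, D, 16, 16)
        flat = z.permute(0, 2, 3, 1).reshape(-1, z.shape[1])
        dists = torch.cdist(flat, self.codebook.weight)   # dist to every code
        return dists.argmin(dim=1).view(img.shape[0], -1) # (B, 256) token ids

class AffordancePrior(nn.Module):
    """Decoder-only Transformer over [scene tokens ; future tokens]."""
    def __init__(self, num_codes=512, dim=256, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(num_codes, dim)
        self.pos = nn.Embedding(max_len, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.tf = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, num_codes)

    def forward(self, tokens):                            # tokens: (B, T)
        T = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        return self.head(self.tf(x, mask=mask))           # (B, T, num_codes)

# Sampling diverse futures: condition on scene tokens, sample future tokens.
vqvae, prior = ToyVQVAE(), AffordancePrior()
scene_tokens = vqvae.encode_to_tokens(torch.rand(1, 3, 64, 64))    # (1, 256)
seq = scene_tokens
for _ in range(256):                                      # one future latent grid
    logits = prior(seq)[:, -1]                            # next-code distribution
    nxt = torch.multinomial(F.softmax(logits, dim=-1), 1) # sampling -> diversity
    seq = torch.cat([seq, nxt], dim=1)
future_tokens = seq[:, 256:]  # would be decoded to an image by a VQ-VAE decoder
```

Stochastic sampling from the learned conditional distribution is what produces multiple distinct plausible futures for the same input scene, which is the property the paper exploits to guide exploration during goal-conditioned policy learning.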
Pages: 3029-3036 (8 pages)