Visual Affordance Prediction for Guiding Robot Exploration

Cited by: 2
Authors
Bharadhwaj, Homanga [1]
Gupta, Abhinav [1]
Tulsiani, Shubham [1]
Affiliation
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
Source
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA | 2023
DOI
10.1109/ICRA48891.2023.10161288
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Motivated by the intuitive understanding humans have about the space of possible interactions, and the ease with which they can generalize this understanding to previously unseen scenes, we develop an approach for learning 'visual affordances'. Given an input image of a scene, we infer a distribution over plausible future states that can be achieved via interactions with it. To allow predicting diverse plausible futures, we discretize the space of continuous images with a VQ-VAE and use a Transformer-based model to learn a conditional distribution in the latent embedding space. We show that these models can be trained using large-scale and diverse passive data, and that the learned models exhibit compositional generalization to diverse objects beyond the training distribution. We evaluate the quality and diversity of the generations, and demonstrate how the trained affordance model can be used for guiding exploration during visual goal-conditioned policy learning in robotic manipulation.
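To make the abstract's pipeline concrete, the sketch below illustrates the general two-stage recipe it describes: a VQ-VAE tokenizes images into discrete latent codes, and an autoregressive Transformer models a conditional distribution over future-state codes given the current scene's codes, so that sampling yields diverse plausible futures. This is a minimal PyTorch sketch, not the authors' implementation; all module names (ToyVQVAE, AffordancePrior), sizes, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (NOT the paper's code): a tiny VQ-VAE image
# tokenizer plus an autoregressive Transformer prior over latent codes,
# mirroring the two-stage recipe the abstract describes. All sizes and
# names below are assumptions chosen for brevity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyVQVAE(nn.Module):
    """Conv encoder -> nearest-codebook quantization (decoder omitted)."""
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=4, stride=4),   # 64x64 -> 16x16 grid
            nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1),
        )
        self.codebook = nn.Embedding(num_codes, dim)

    def encode_to_tokens(self, img):                      # img: (B, 3, 64, 64)
        z = self.enc(img)                                 # (B, D, 16, 16)
        flat = z.permute(0, 2, 3, 1).reshape(-1, z.shape[1])
        dists = torch.cdist(flat, self.codebook.weight)   # dist to every code
        return dists.argmin(dim=1).view(img.shape[0], -1) # (B, 256) token ids

class AffordancePrior(nn.Module):
    """Decoder-only Transformer over [scene tokens ; future tokens]."""
    def __init__(self, num_codes=512, dim=256, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(num_codes, dim)
        self.pos = nn.Embedding(max_len, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.tf = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, num_codes)

    def forward(self, tokens):                            # tokens: (B, T)
        T = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        return self.head(self.tf(x, mask=mask))           # (B, T, num_codes)

# Sampling diverse futures: condition on scene tokens, sample future tokens.
vqvae, prior = ToyVQVAE(), AffordancePrior()
scene_tokens = vqvae.encode_to_tokens(torch.rand(1, 3, 64, 64))    # (1, 256)
seq = scene_tokens
for _ in range(256):                                      # one future latent grid
    logits = prior(seq)[:, -1]                            # next-code distribution
    nxt = torch.multinomial(F.softmax(logits, dim=-1), 1) # sampling -> diversity
    seq = torch.cat([seq, nxt], dim=1)
future_tokens = seq[:, 256:]  # would be decoded to an image by a VQ-VAE decoder
```

Stochastic sampling from the learned conditional distribution is what produces multiple distinct plausible futures for the same input scene, which is the property the paper exploits to guide exploration during goal-conditioned policy learning.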
Pages: 3029-3036 (8 pages)