A Survey of Embodied AI: From Simulators to Research Tasks

Cited by: 148
Authors
Duan, Jiafei [1]
Yu, Samson [2]
Tan, Hui Li [3]
Zhu, Hongyuan [3]
Tan, Cheston [3]
Affiliations
[1] Nanyang Technol Univ Singapore, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Singapore Univ Technol & Design, Singapore 487372, Singapore
[3] ASTAR, Inst Infocomm Res, Singapore 138632, Singapore
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2022 / Vol. 6 / Issue 2
Funding
National Research Foundation, Singapore;
Keywords
Artificial intelligence; Task analysis; Navigation; Physics; Three-dimensional displays; Visualization; Solid modeling; Embodied AI; computer vision; 3D simulators;
DOI
10.1109/TETCI.2022.3141105
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
There has been an emerging paradigm shift from the era of "internet AI" to "embodied AI," where AI algorithms and agents no longer learn from datasets of images, videos or text curated primarily from the internet. Instead, they learn through interactions with their environments from an egocentric perspective, similar to humans. Consequently, there has been substantial growth in the demand for embodied AI simulators to support various embodied AI research tasks. This growing interest in embodied AI is beneficial to the greater pursuit of Artificial General Intelligence (AGI), but there has not been a contemporary and comprehensive survey of this field. This paper aims to provide an encyclopedic survey of the field of embodied AI, from its simulators to its research. By evaluating nine current embodied AI simulators against our proposed seven features, this paper aims to understand how well the simulators support embodied AI research and where their limitations lie. The paper then surveys the three main research tasks in embodied AI: visual exploration, visual navigation and embodied question answering (QA), covering the state-of-the-art approaches, evaluation metrics and datasets. Finally, with the new insights revealed through surveying the field, the paper provides suggestions for simulator-for-task selection and recommendations for the future directions of the field.
Pages: 230-244
Number of pages: 15