Interpreting Natural Language Instructions Using Language, Vision, and Behavior

Cited by: 3
Authors
Benotti, Luciana [1, 2]
Lau, Tessa [3]
Villalba, Martin [1, 4]
Affiliations
[1] Univ Nacl Cordoba, Cordoba, Argentina
[2] Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
[3] Savioke Inc, Sunnyvale, CA USA
[4] Univ Potsdam, D-14476 Potsdam, Germany
Keywords
Design; Algorithms; Performance; Natural language interpretation; multimodal understanding; action recognition; visual feedback; situated virtual agent; unsupervised learning
DOI
10.1145/2629632
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. Our approach uses an automatic annotation phase based on artificial intelligence planning, for which two different annotation strategies are compared: one based on behavioral information and the other based on visibility information. The resulting annotations are used as training data for different automatic classifiers. This algorithm is based on the intuition that the problem of interpreting a situated instruction can be cast as a classification problem of choosing among the actions that are possible in the situation. Classification is done by combining language, vision, and behavior information. Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.
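The classification framing in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a naive Bayes classifier over bag-of-words features as a stand-in for the classifiers they compare, it uses language information only (no vision or behavior features), and the function names `train` and `interpret` and the action labels such as `press(red_button)` are hypothetical. The one idea it takes from the abstract is that the classifier chooses only among the actions that are possible in the current situation.

```python
import math
from collections import Counter, defaultdict


def train(pairs):
    """Build word/action counts from (instruction_text, action_label) pairs.

    In the article the labels come from an automatic annotation phase;
    here they are simply given (an assumption made for this sketch).
    """
    word_counts = defaultdict(Counter)  # action -> token counts
    action_counts = Counter()           # action -> number of instructions
    vocab = set()
    for text, action in pairs:
        tokens = text.lower().split()
        word_counts[action].update(tokens)
        action_counts[action] += 1
        vocab.update(tokens)
    return word_counts, action_counts, vocab


def interpret(text, possible_actions, model):
    """Return the highest-scoring action among those possible right now."""
    word_counts, action_counts, vocab = model
    total = sum(action_counts.values())
    best, best_score = None, -math.inf
    for action in possible_actions:
        # Laplace-smoothed log prior of the action.
        score = math.log((action_counts[action] + 1) / (total + len(action_counts)))
        n_tokens = sum(word_counts[action].values())
        for tok in text.lower().split():
            # Laplace-smoothed log likelihood of each token given the action.
            score += math.log((word_counts[action][tok] + 1) / (n_tokens + len(vocab)))
        if score > best_score:
            best, best_score = action, score
    return best


# Hypothetical training data for illustration only.
model = train([
    ("push the red button", "press(red_button)"),
    ("press the red button", "press(red_button)"),
    ("go through the door", "move(door)"),
    ("walk to the door", "move(door)"),
    ("push the blue button", "press(blue_button)"),
])
print(interpret("push the red button", ["press(red_button)", "move(door)"], model))
```

Passing `possible_actions` explicitly is the design choice that matters here: the same instruction can map to different actions depending on what the situation allows, so the set of candidate labels changes from call to call.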
Pages: 22