Interpreting Natural Language Instructions Using Language, Vision, and Behavior

Cited by: 3
Authors
Benotti, Luciana [1 ,2 ]
Lau, Tessa [3 ]
Villalba, Martin [1 ,4 ]
Affiliations
[1] Univ Nacl Cordoba, Cordoba, Argentina
[2] Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
[3] Savioke Inc, Sunnyvale, CA USA
[4] Univ Potsdam, D-14476 Potsdam, Germany
Keywords
Design; Algorithms; Performance; Natural language interpretation; multimodal understanding; action recognition; visual feedback; situated virtual agent; unsupervised learning;
DOI
10.1145/2629632
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. Our approach uses an automatic annotation phase based on artificial intelligence planning, for which two different annotation strategies are compared: one based on behavioral information and the other based on visibility information. The resulting annotations are used as training data for different automatic classifiers. This algorithm is based on the intuition that the problem of interpreting a situated instruction can be cast as a classification problem of choosing among the actions that are possible in the situation. Classification is done by combining language, vision, and behavior information. Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.
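The classification framing in the abstract, choosing among the actions that are possible in the current situation, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only word counts in a naive-Bayes-style scorer, whereas the article combines language, vision, and behavior information, and the (instruction, action) pairs are written by hand here rather than produced by a planning-based annotation phase. The function names `train` and `interpret` and the action labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(pairs):
    """Count words per action label from (instruction, action) pairs.

    In the article such pairs come from automatic annotation; here they
    are assumed to be given.
    """
    word_counts = defaultdict(Counter)
    action_counts = Counter()
    for text, action in pairs:
        action_counts[action] += 1
        word_counts[action].update(text.lower().split())
    return word_counts, action_counts


def interpret(text, possible_actions, model):
    """Return the best-scoring action among those possible in the situation."""
    word_counts, action_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(action_counts.values())

    def score(action):
        counts = word_counts.get(action, Counter())
        n_words = sum(counts.values())
        # Smoothed log prior, so actions never seen in training stay scoreable.
        s = math.log((action_counts.get(action, 0) + 1)
                     / (total + len(possible_actions)))
        # Add-one smoothed log likelihood of each word in the instruction.
        for w in text.lower().split():
            s += math.log((counts[w] + 1) / (n_words + len(vocab) + 1))
        return s

    # Only actions available in the current situation are candidates.
    return max(possible_actions, key=score)


# Hypothetical toy data; action labels are illustrative only.
model = train([
    ("press the red button", "press_red"),
    ("push the red one", "press_red"),
    ("go through the door", "move_door"),
    ("walk to the door", "move_door"),
    ("press the blue button", "press_blue"),
])
print(interpret("hit the red button", ["press_red", "move_door"], model))
```

Restricting the candidate set to `possible_actions` is the part that mirrors the abstract's intuition: the same instruction can map to different actions depending on what the situation allows.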
Pages: 22