Learning Task Specifications from Demonstrations

Cited by: 0
Authors
Vazquez-Chanlatte, Marcell [1]
Jha, Susmit [2]
Tiwari, Ashish [2]
Ho, Mark K. [1]
Seshia, Sanjit A. [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] SRI Int, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31
Funding
US National Science Foundation
Keywords
MODEL
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics) demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model and give an efficient approach to search for the most likely specification in a large candidate pool of specifications. In our experiments, we demonstrate how learning specifications can help avoid common problems that often arise due to ad-hoc reward composition.
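As a rough illustration of the MAP formulation described in the abstract, the sketch below scores a small pool of candidate Boolean specifications against demonstrations and returns the highest-scoring one. It is a toy under stated assumptions, not the paper's derivation: the scoring rule (number of demonstrations times the KL divergence between the demonstrators' satisfaction rate and a random-behavior baseline), the uniform prior, the trace encoding, the candidate predicates, and the baseline rates in `CANDIDATES` are all hypothetical choices made for illustration.

```python
import math

def bernoulli_kl(p, q):
    """KL divergence D(Bern(p) || Bern(q)), treating 0 * log(0) as 0."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)  # keep the logarithms finite
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total

def log_score(spec, demos, random_rate, log_prior=0.0):
    """Unnormalized log-posterior of one candidate (assumed toy model).

    spec        : predicate mapping a trace to True/False
    random_rate : assumed probability that random behavior satisfies spec
    """
    demo_rate = sum(spec(trace) for trace in demos) / len(demos)
    if demo_rate < random_rate:
        # Demonstrators do worse than chance: rule the candidate out.
        return -math.inf
    return log_prior + len(demos) * bernoulli_kl(demo_rate, random_rate)

def map_specification(candidates, demos):
    """Return the name of the highest-scoring candidate (uniform prior)."""
    return max(
        candidates,
        key=lambda name: log_score(candidates[name][0], demos, candidates[name][1]),
    )

# Hypothetical demonstrations: each trace is a sequence of visited cell labels.
DEMOS = [
    ["dry", "dry", "goal"],
    ["dry", "goal"],
    ["dry", "dry", "dry", "goal"],
]

# Hypothetical candidate pool: name -> (predicate, assumed random_rate).
CANDIDATES = {
    "true": (lambda t: True, 1.0),
    "avoid_lava": (lambda t: "lava" not in t, 0.5),
    "reach_goal": (lambda t: "goal" in t, 0.2),
    "reach_goal_and_avoid_lava": (lambda t: "goal" in t and "lava" not in t, 0.1),
}

best = map_specification(CANDIDATES, DEMOS)
print(best)
```

Under this scoring rule, a specification that every demonstration satisfies but that random behavior rarely satisfies scores highest, which is why the trivially true candidate is never preferred. In a realistic setting the baseline rates would have to be computed or estimated from the environment dynamics rather than supplied by hand.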
Pages: 11