Learning Task Specifications from Demonstrations

Cited by: 0

Authors:

Vazquez-Chanlatte, Marcell ^{[1]}

Jha, Susmit ^{[2]}

Tiwari, Ashish ^{[2]}

Ho, Mark K. ^{[1]}

Seshia, Sanjit A. ^{[1]}

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] SRI Int, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA

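Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018) | 2018 / Vol. 31

Funding:

National Science Foundation (USA);

Keywords:

MODEL;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: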

Real-world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics), demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model, and give an efficient approach to search for the most likely specification in a large candidate pool of specifications. In our experiments, we demonstrate how learning specifications can help avoid common problems that often arise due to ad hoc reward composition.
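
As a reading aid, the MAP formulation named in the abstract can be written in its generic textbook form. The symbols here are assumptions introduced for illustration and do not come from this record: \Phi stands for the candidate pool of specifications, X for the set of demonstrations, and \varphi^{*} for a most likely specification. The abstract states only that the likelihood term is derived from the principle of maximum entropy, so its concrete form is left unspecified.

```latex
% Generic MAP inference over a candidate pool of specifications.
% \Phi, X, and \varphi^{*} are illustrative names, not notation taken from the record.
\varphi^{*} \in \operatorname*{arg\,max}_{\varphi \in \Phi} \Pr(\varphi \mid X)
            = \operatorname*{arg\,max}_{\varphi \in \Phi} \Pr(X \mid \varphi)\,\Pr(\varphi)
```

The second equality is Bayes' rule with the normalizing term \Pr(X) dropped, since it does not depend on \varphi.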

Pages: 11
