Behaviour Learning with Adaptive Motif Discovery and Interacting Multiple Model

Cited by: 0

Authors:

Zhao, Hanqing [1]

Manderson, Travis [1]

Zhang, Hao [2]

Liu, Xue [1]

Dudek, Gregory [1]

Affiliations:

[1] McGill Univ, Sch Comp Sci, Montreal, PQ, Canada

[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China

Source:

2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS) | 2022

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

ALGORITHM;

DOI:

10.1109/IROS47612.2022.9981588

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

We propose an approach that enables simultaneous, interpretable learning of a high-level discrete behaviour and its low-level rhythmic sub-behaviour. We do this through a unified reward function, in which a reward that describes only the low-level behaviour, and therefore has less impact on the learning of other behaviours, is recovered from few-shot motion demonstrations. To this end, we first extract local behaviour motifs from state-only human demonstrations and random driving samples using an adaptive motif discovery approach derived from the Matrix Profile algorithm. We then optimize the parameters for motif discovery by maximizing the sum and entropy over motif sizes. Interacting Multiple Model (IMM) estimators are constructed on top of the linear-Gaussian dynamics of the discovered motifs, and the cumulative distributions over motifs estimated by the IMMs serve as the basis of the reward function. By combining the recovered reward with the terrain type signal gathered from the environment, we are able to train a dual-objective off-road vehicle controller that demonstrates both terrain selection and human-like driving behaviours. Compared with related approaches across 10 people, our rhythmic behaviour reward recovery approach enables the controller to produce higher preference over human driving demonstrations. In addition to performing more stably across different people, with 87% less variance than the best baseline in the rhythmic behaviour indicator, our method reduces the negative effects on higher-level behaviour learning while maintaining high interpretability at all stages of the algorithm.
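
As a rough illustration of the Matrix Profile idea that the abstract's motif discovery step is derived from, the sketch below computes a naive matrix profile over a one-dimensional series and returns the closest pair of non-overlapping subsequences. It is not the authors' adaptive variant: the fixed window length `m`, the half-window exclusion zone, and the function names `znorm`, `matrix_profile` and `top_motif` are all assumptions made for this example.

```python
import math


def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive O(n^2 * m) matrix profile (illustrative, not optimised).

    profile[i] is the z-normalised Euclidean distance from the window
    starting at i to its nearest non-trivial neighbour; index[i] is the
    start position of that neighbour.
    """
    count = len(series) - m + 1
    if count < 2:
        raise ValueError("series is too short for window length m")
    windows = [znorm(series[i:i + m]) for i in range(count)]
    exclusion = max(1, m // 2)  # assumed exclusion zone against trivial matches
    profile = [math.inf] * count
    index = [-1] * count
    for i in range(count):
        for j in range(count):
            if abs(i - j) < exclusion:
                continue  # skip the window itself and near-overlapping windows
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            if dist < profile[i]:
                profile[i] = dist
                index[i] = j
    return profile, index


def top_motif(series, m):
    """Return (start_a, start_b, distance) for the closest subsequence pair."""
    profile, index = matrix_profile(series, m)
    best = min(range(len(profile)), key=profile.__getitem__)
    return best, index[best], profile[best]
```

Calling `top_motif(series, m)` on a series that contains the same pattern twice should return the two start positions of that pattern with a distance close to zero. Per the abstract, the paper additionally tunes the discovery parameters by maximizing the sum and entropy over motif sizes, which this fixed-`m` sketch does not attempt.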

Pages: 10788 - 10794

Page count: 7

Related documents (50 in total):

[1] Multiple adaptive factors based interacting multiple model estimator
Sun, Minxing
Duan, Qianwen
Xia, Wanrun
Bao, Qiliang
Mao, Yao
IET CONTROL THEORY AND APPLICATIONS, 2024, 18 (08) : 1059 - 1069
[2] Mining Network Motif Discovery by Learning Techniques
Mursa, Bogdan-Eduard-Madalin
Andreica, Anca
Diosan, Laura
HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2019, 2019, 11734 : 73 - 84
[3] An adaptive approach for estimation of transition probability matrix in the interacting multiple model filter
Cosme, Luciana Balieiro
Silveira Vasconcelos D'Angelo, Marcos Flavio
Caminhas, Walmir Matos
Camargos, Murilo Osorio
Palhares, Reinaldo Martinez
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 41 (01) : 155 - 166
[4] A new adaptive control scheme based on the interacting multiple model (IMM) estimation
Afshari, Hamed H.
Al-Ani, Dhafar
Habibi, Saeid
JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2016, 30 (06) : 2759 - 2767
[5] Interacting Multiple Model Estimation-based Adaptive Robust Unscented Kalman Filter
Gao, Bingbing
Gao, Shesheng
Zhong, Yongmin
Hu, Gaoge
Gu, Chengfan
INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2017, 15 (05) : 2013 - 2025
[6] Prognostics by Interacting Multiple Model Estimator
Yan, Yanjun
Mallick, Mahendra
Hang, James Z.
Liu, Jie
2016 IEEE INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT (ICPHM), 2016
[7] Shipborne radar maneuvering target tracking based on the variable structure adaptive grid interacting multiple model
Zhu, Zheng-wei
JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2013, 14 (09) : 733 - 742
[8] RL-MD: A Novel Reinforcement Learning Approach for DNA Motif Discovery
Wang, Wen
Wang, Jianzong
Si, Shijing
Huang, Zhangcheng
Xiao, Jing
2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2022 : 297 - 303
[9] State of charge estimation of li-ion batteries based on the noise-adaptive interacting multiple model
Huang, Ce
Yu, Xiaoyang
Wang, Yongchao
Zhou, Yongqin
Li, Ran
ENERGY REPORTS, 2021, 7 : 8152 - 8161
[10] Adaptive interacting multiple model for underwater maneuvering target tracking with one-step randomly delayed measurements
Li, Xiaohua
Lu, Bo
Li, Yuxing
Lu, Xiaofeng
Jin, Haiyan
OCEAN ENGINEERING, 2023, 280
