Observation-Augmented Contextual Multi-Armed Bandits for Robotic Search and Exploration

Cited by: 1
Authors
Wakayama, Shohei [1]
Ahmed, Nisar [1]
Affiliations
[1] Univ Colorado, Smead Aerosp Engn Sci Dept, Boulder, CO 80303 USA
Source
IEEE ROBOTICS AND AUTOMATION LETTERS | 2024 / Vol. 9 / No. 10
Keywords
Robots; Semantics; Robot sensing systems; Uncertainty; Decision making; Probabilistic logic; Bayes methods; Probabilistic inference; human-robot collaboration; MIXTURE
DOI
10.1109/LRA.2024.3448133
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
We introduce a new variant of contextual multi-armed bandits (CMABs) called observation-augmented CMABs (OA-CMABs) wherein a robot uses extra outcome observations from an external information source, e.g. humans. In OA-CMABs, external observations are a function of context features and thus provide evidence on top of observed option outcomes to infer hidden parameters. However, if external data is error-prone, measures must be taken to preserve the correctness of inference. To this end, we derive a robust Bayesian inference process for OA-CMABs based on recently developed probabilistic semantic data association techniques, which handle complex mixture model parameter priors and hybrid discrete-continuous observation likelihoods for semantic external data sources. To cope with combined uncertainties in OA-CMABs, we also derive a new active inference algorithm for optimal option selection based on approximate expected free energy minimization. This generalizes prior work on CMAB active inference by accounting for faulty observations and non-Gaussian distributions. Results for a simulated deep space search site selection problem show that, even if incorrect semantic observations are provided externally, e.g. by scientists, efficient decision-making and robust parameter inference are still achieved in a wide variety of conditions.
Pages: 8531-8538
Page count: 8
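Illustrative sketch
The abstract says that inference must stay correct even when external observations are error-prone. The following is a minimal sketch of that general idea, not the authors' algorithm: it runs a grid-based Bayesian update whose likelihood is a mixture of a "reliable" term and an uninformative "faulty" term, in the spirit of probabilistic data association. The function name `robust_posterior`, the fault probability `p_faulty`, the two-label semantic observation ("high" vs. "low"), and the logistic likelihood are all assumptions made for illustration.

```python
import numpy as np


def robust_posterior(prior, likelihood, p_faulty, n_labels):
    """Grid posterior over a hidden parameter after one semantic observation.

    Hypothetical toy model: with probability p_faulty the observation is
    faulty and assumed to be drawn uniformly from n_labels labels, so it
    carries no information about the parameter.
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    # Mixture likelihood: reliable term plus a constant faulty term.
    mixed = (1.0 - p_faulty) * likelihood + p_faulty / n_labels
    unnormalized = prior * mixed
    return unnormalized / unnormalized.sum()


# Hidden parameter discretized on [0, 1] with a uniform prior.
theta = np.linspace(0.0, 1.0, 101)
prior = np.full(theta.size, 1.0 / theta.size)

# Assumed likelihood of the label "high" given theta (logistic in theta).
lik_high = 1.0 / (1.0 + np.exp(-12.0 * (theta - 0.5)))

# Same observation, interpreted with full trust and with 40% fault probability.
trusting = robust_posterior(prior, lik_high, p_faulty=0.0, n_labels=2)
cautious = robust_posterior(prior, lik_high, p_faulty=0.4, n_labels=2)

mean_trusting = float(theta @ trusting)
mean_cautious = float(theta @ cautious)
print(f"posterior mean, trusted observation:   {mean_trusting:.3f}")
print(f"posterior mean, possibly faulty input: {mean_cautious:.3f}")
```

In this toy setting, a larger `p_faulty` pulls the posterior back toward the prior, so a single wrong semantic report moves the estimate less. The paper's method additionally handles mixture-model priors and hybrid discrete-continuous likelihoods, which this sketch does not attempt.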