Optimal Control of Logically Constrained Partially Observable and Multiagent Markov Decision Processes

Authors
Kalagarla, Krishna C. [1, 2]
Kartik, Dhruva [1, 3]
Shen, Dongming [1, 4]
Jain, Rahul [1 ]
Nayyar, Ashutosh [1 ]
Nuzzo, Pierluigi [1 ]
Affiliations
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] Univ New Mexico, Elect & Comp Engn Dept, Albuquerque, NM 87106 USA
[3] Amazon, Seattle, WA 98121 USA
[4] MIT Sloan Sch Management, Cambridge, MA USA
Keywords
Logic; Planning; Robots; Optimal control; Markov decision processes; Task analysis; Stochastic processes; Markov decision processes (MDPs); multiagent systems; partially observable Markov decision processes (POMDPs); stochastic optimal control; temporal logic
DOI
10.1109/TAC.2024.3422213
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Autonomous systems often have logical constraints arising, for example, from safety, operational, or regulatory requirements. Such constraints can be expressed using temporal logic specifications. The system state is often partially observable. Moreover, it could encompass a team of multiple agents with a common objective but disparate information structures and constraints. In this article, we first introduce an optimal control theory for partially observable Markov decision processes with finite linear temporal logic constraints. We provide a structured methodology for synthesizing policies that maximize a cumulative reward while ensuring that the probability of satisfying a temporal logic constraint is sufficiently high. Our approach comes with guarantees on approximate reward optimality and constraint satisfaction. We then build on this approach to design an optimal control framework for logically constrained multiagent settings with information asymmetry. We illustrate the effectiveness of our approach by implementing it on several case studies.
Pages: 263-277
Page count: 15