Optimal Control of Logically Constrained Partially Observable and Multiagent Markov Decision Processes

Cited by: 0
Authors
Kalagarla, Krishna C. [1,2]
Kartik, Dhruva [1,3]
Shen, Dongming [1,4]
Jain, Rahul [1]
Nayyar, Ashutosh [1]
Nuzzo, Pierluigi [1]
Affiliations
[1] Univ Southern Calif, Ming Hsieh Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] Univ New Mexico, Elect & Comp Engn Dept, Albuquerque, NM 87106 USA
[3] Amazon, Seattle, WA 98121 USA
[4] MIT Sloan Sch Management, Cambridge, MA USA
Keywords
Logic; Planning; Robots; Optimal control; Markov decision processes; Task analysis; Stochastic processes; Markov decision processes (MDPs); multiagent systems; partially observable Markov decision processes (POMDPs); stochastic optimal control; temporal logic;
DOI
10.1109/TAC.2024.3422213
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Autonomous systems often have logical constraints arising, for example, from safety, operational, or regulatory requirements. Such constraints can be expressed using temporal logic specifications. The system state is often partially observable. Moreover, the system could encompass a team of multiple agents with a common objective but disparate information structures and constraints. In this article, we first introduce an optimal control theory for partially observable Markov decision processes (POMDPs) with finite linear temporal logic constraints. We provide a structured methodology for synthesizing policies that maximize a cumulative reward while ensuring that the probability of satisfying a temporal logic constraint is sufficiently high. Our approach comes with guarantees on approximate reward optimality and constraint satisfaction. We then build on this approach to design an optimal control framework for logically constrained multiagent settings with information asymmetry. We illustrate the effectiveness of our approach by applying it to several case studies.
Pages: 263-277
Page count: 15
Related Papers
50 records in total
[21]   Markov decision processes based optimal control policies for probabilistic Boolean networks [J].
Abul, O ;
Alhajj, R ;
Polat, F .
BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, :337-344
[22]   Partially observable Markov decision model for the treatment of early prostate cancer [J].
Goulionis, J. E. ;
Koutsiumaris, B. K. .
OPSEARCH, 2010, 47 (2) :105-117
[23]   Guided Soft Actor Critic: A Guided Deep Reinforcement Learning Approach for Partially Observable Markov Decision Processes [J].
Haklidir, Mehmet ;
Temeltas, Hakan .
IEEE ACCESS, 2021, 9 :159672-159683
[24]   Observer and control design in partially observable finite Markov chains [J].
Clempner, Julio B. ;
Poznyak, Alexander S. .
AUTOMATICA, 2019, 110
[25]   Robustness of policies in constrained Markov decision processes [J].
Zadorojniy, A ;
Shwartz, A .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) :635-638
[26]   Detection-averse optimal and receding-horizon control for Markov decision processes [J].
Li, Nan ;
Kolmanovsky, Ilya ;
Girard, Anouck .
AUTOMATICA, 2020, 122
[27]   Solving Multiagent Markov Decision Processes: A Forest Management Example [J].
Chades, Iadine ;
Bouteiller, Bertrand .
MODSIM 2005: INTERNATIONAL CONGRESS ON MODELLING AND SIMULATION: ADVANCES AND APPLICATIONS FOR MANAGEMENT AND DECISION MAKING, 2005, :1594-1600
[28]   Experimental Design for Partially Observed Markov Decision Processes [J].
Thorbergsson, Leifur ;
Hooker, Giles .
SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2018, 6 (02) :549-567
[30]   Constrained Markov decision processes with first passage criteria [J].
Huang, Yonghui ;
Wei, Qingda ;
Guo, Xianping .
ANNALS OF OPERATIONS RESEARCH, 2013, 206 (01) :197-219