Minimal Disclosure in Partially Observable Markov Decision Processes

Cited by: 2
Authors:
Bertrand, Nathalie [1 ]
Genest, Blaise [2 ,3 ]
Affiliations:
[1] INRIA Rennes Bretagne Atlantique, Rennes, France
[2] CNRS, UMI IPAL, NUS, Singapore, Singapore
[3] ASTAR, I2R, Singapore, Singapore
Keywords:
Partially Observable Markov Decision Processes; Stochastic Games; Model-Checking; Worst-Case/Average-Case Analysis; DIAGNOSIS; AUTOMATA; GAMES;
DOI: 10.4230/LIPIcs.FSTTCS.2011.411
CLC Number: TP31 [Computer Software];
Subject Classification Codes: 081202; 0835;
Abstract
For security and efficiency reasons, most systems do not give users full access to their information. A key specification formalism for such systems is the Partially Observable Markov Decision Process (POMDP for short), which has been extensively studied in several research communities, among them AI and model checking. In this paper we tackle the problem of the minimal information a user needs at runtime to achieve a simple goal, modeled as reaching an objective with probability one. More precisely, to achieve her goal, the user can at each step either use the partial information available for free, or pay a fixed cost and receive the full information. The natural question is then how to minimize the cost the user incurs to fulfill her objective. This optimization question gives rise to two different problems, depending on whether we minimize the worst-case cost or the average cost. On the one hand, concerning the worst-case cost, we show that efficient techniques from the model-checking community can be adapted to compute the optimal worst-case cost and to give optimal strategies for the user. On the other hand, we show that the optimal average cost (a question typically considered in the AI community) cannot be computed in general, nor can it be approximated in polynomial time, even up to a large approximation factor.
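The worst-case variant of the abstract's question can be illustrated with a support-based (knowledge-set) fixpoint in the style of model-checking techniques. The sketch below is a hypothetical toy, not the paper's construction: transitions are deterministic for simplicity, the states `s0`/`s1`/`t`/`dead`, the actions, and `DISCLOSURE_COST` are all invented, and the user either acts blindly on her current knowledge set or pays once to learn the exact state. In the toy, action `a` is only safe from `s0` and `b` only from `s1`, so blind play risks the `dead` sink and the optimal worst-case cost is one disclosure.

```python
from itertools import combinations

# Hypothetical toy model (all names are illustrative assumptions):
# the user cannot distinguish s0 from s1 without paying for disclosure.
STATES = ["s0", "s1", "t", "dead"]
ACTIONS = ["a", "b"]
TARGET = "t"
DISCLOSURE_COST = 1
INF = float("inf")

# Deterministic transition function: (state, action) -> successor state.
delta = {
    ("s0", "a"): "t",      ("s0", "b"): "dead",
    ("s1", "a"): "dead",   ("s1", "b"): "t",
    ("t", "a"): "t",       ("t", "b"): "t",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}

def post(support, action):
    """Knowledge set reached by playing `action` blindly from `support`."""
    return frozenset(delta[(s, action)] for s in support)

def supports():
    """All non-empty knowledge sets (subsets of STATES)."""
    for r in range(1, len(STATES) + 1):
        for combo in combinations(STATES, r):
            yield frozenset(combo)

def worst_case_costs():
    """Fixpoint over knowledge sets: minimal worst-case disclosure cost
    to surely reach TARGET (INF if the target cannot be guaranteed)."""
    V = {K: (0 if K == frozenset([TARGET]) else INF) for K in supports()}
    changed = True
    while changed:
        changed = False
        for K in V:
            if K == frozenset([TARGET]):
                continue
            # Option 1: act blindly on the current knowledge set.
            blind = min(V[post(K, a)] for a in ACTIONS)
            # Option 2: pay for full disclosure, then face the worst state.
            pay = DISCLOSURE_COST + max(V[frozenset([s])] for s in K)
            best = min(blind, pay)
            if best < V[K]:
                V[K] = best
                changed = True
    return V

V = worst_case_costs()
print(V[frozenset(["s0", "s1"])])  # → 1: one disclosure suffices and is necessary
```

Since values only decrease and there are finitely many knowledge sets, the iteration terminates; this mirrors how support-based abstractions make almost-sure reachability questions decidable even though general quantitative POMDP questions are not.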
Pages: 411 - 422 (12 pages)
Related Papers (50 total)
  • [1] Partially Observable Markov Decision Processes and Robotics
    Kurniawati, Hanna
    ANNUAL REVIEW OF CONTROL ROBOTICS AND AUTONOMOUS SYSTEMS, 2022, 5 : 253 - 277
  • [2] A tutorial on partially observable Markov decision processes
    Littman, Michael L.
    JOURNAL OF MATHEMATICAL PSYCHOLOGY, 2009, 53 (03) : 119 - 125
  • [3] Quantum partially observable Markov decision processes
    Barry, Jennifer
    Barry, Daniel T.
    Aaronson, Scott
    PHYSICAL REVIEW A, 2014, 90 (03)
  • [4] PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES WITH PARTIALLY OBSERVABLE RANDOM DISCOUNT FACTORS
    Martinez-Garcia, E. Everardo
    Minjarez-Sosa, J. Adolfo
    Vega-Amaya, Oscar
    KYBERNETIKA, 2022, 58 (06) : 960 - 983
  • [5] Active learning in partially observable Markov decision processes
    Jaulmes, R
    Pineau, J
    Precup, D
    MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 601 - 608
  • [6] Structural Estimation of Partially Observable Markov Decision Processes
    Chang, Yanling
    Garcia, Alfredo
    Wang, Zhide
    Sun, Lu
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (08) : 5135 - 5141
  • [7] Nonapproximability results for partially observable Markov decision processes
    Lusena, C
    Goldsmith, J
    Mundhenk, M
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 14 : 83 - 113
  • [8] Entropy Maximization for Partially Observable Markov Decision Processes
    Savas, Yagiz
    Hibbard, Michael
    Wu, Bo
    Tanaka, Takashi
    Topcu, Ufuk
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (12) : 6948 - 6955
  • [9] Decentralized Control of Partially Observable Markov Decision Processes
    Amato, Christopher
    Chowdhary, Girish
    Geramifard, Alborz
    Ure, N. Kemal
    Kochenderfer, Mykel J.
    2013 IEEE 52ND ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2013, : 2398 - 2405
  • [10] On Anderson Acceleration for Partially Observable Markov Decision Processes
    Ermis, Melike
    Park, Mingyu
    Yang, Insoon
    2021 60TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2021, : 4478 - 4485