Dynamic parameters in Sequential Decision Making

Authors
Srivastava, Amber [1]
Salapaka, Srinivasa M. [2, 3]
Affiliations
[1] Swiss Fed Inst Technol, Automat Control Lab, Phys Str 3, CH-8092 Zurich, Switzerland
[2] Univ Illinois, Coordinated Sci Lab, Champaign, IL USA
[3] Univ Illinois, Mech Sci & Engn, Champaign, IL USA
Funding
Swiss National Science Foundation
Keywords
Markov decision processes; Maximum Entropy Principle; Parameterized state and action spaces;
DOI
10.1016/j.automatica.2022.110795
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Sequential Decision Making (SDM) problems optimize over the sequence of actions (or decisions) taken to minimize the underlying cumulative cost. This sequence of actions is referred to as the policy of the SDM. Often these problems include additional (fixed and manipulable) parameters, and the objective is to determine the optimal policy as well as the manipulable parameters that minimize the SDM cost. In this paper we address the class of SDM problems that are characterized by dynamic parameters, where the dynamics are pre-specified for a subset of parameters and manipulable for the others. The objective is to determine the manipulable parameter dynamics as well as the time-varying policy such that the associated SDM cost is minimized at each time instant. To this end, we develop a control-theoretic framework to design the manipulable parameter dynamics such that they track the optimal values of the parameters, and to simultaneously determine the time-varying optimal policy. Our methodology builds upon a Maximum Entropy Principle (MEP) based framework that addresses SDMs. More precisely, this framework results in a smooth approximation of the SDM cost, which we utilize as a control Lyapunov function. We show that under the resulting control law the parameters asymptotically track the local optimum, the proposed control law is Lipschitz continuous and bounded, and the policy of the SDM is optimal for a given set of parameter values. The simulations demonstrate the efficacy of our proposed methodology. (c) 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
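The idea of using a smooth, maximum-entropy approximation of a minimum cost as a descent (Lyapunov-like) function can be illustrated with a small sketch. This is not the paper's control law or its SDM model. It is a generic toy that assumes two candidate actions whose costs depend on one scalar parameter `theta`, and it uses plain discretized gradient flow. The function names, the cost model in `action_costs`, and the values of `beta`, `step`, and `iters` are all hypothetical choices made for illustration.

```python
import math

def soft_min(costs, beta):
    """Log-sum-exp (free-energy) smooth approximation of min(costs).

    Satisfies min(costs) - log(n)/beta <= soft_min <= min(costs).
    """
    m = min(costs)  # shift by the minimum for numerical stability
    return m - math.log(sum(math.exp(-beta * (c - m)) for c in costs)) / beta

def gibbs_policy(costs, beta):
    """Maximum-entropy (Gibbs) distribution over actions at inverse temperature beta."""
    m = min(costs)
    weights = [math.exp(-beta * (c - m)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def action_costs(theta):
    """Hypothetical toy model: costs of two actions as functions of theta."""
    return [(theta - 1.0) ** 2 + 0.5, (theta + 2.0) ** 2]

def action_cost_gradients(theta):
    """Derivatives of the two toy costs with respect to theta."""
    return [2.0 * (theta - 1.0), 2.0 * (theta + 2.0)]

def soft_min_gradient(theta, beta):
    """d/dtheta of soft_min: the policy-weighted average of the cost gradients."""
    policy = gibbs_policy(action_costs(theta), beta)
    return sum(p * g for p, g in zip(policy, action_cost_gradients(theta)))

def track(theta0, beta=5.0, step=0.01, iters=5000):
    """Discretized gradient flow theta' = -dF/dtheta on the smooth cost F."""
    theta = theta0
    for _ in range(iters):
        theta -= step * soft_min_gradient(theta, beta)
    return theta

if __name__ == "__main__":
    for start in (0.5, -2.5):
        final = track(start)
        print(start, "->", round(final, 4), gibbs_policy(action_costs(final), 5.0))
```

In continuous time, the flow theta' = -dF/dtheta gives dF/dt = -(dF/dtheta)^2 <= 0, which is why a smooth cost approximation can serve as a Lyapunov-type function. Different starting points may settle at different local minima, and the Gibbs policy at the final parameter value concentrates on the cheaper action.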
Pages: 8