A tutorial on value function approximation for stochastic and dynamic transportation

Citations: 0
Authors
Heinold, Arne [1 ]
Affiliations
[1] Univ Kiel, Sch Econ & Business, Kiel, Germany
Source
4OR-A QUARTERLY JOURNAL OF OPERATIONS RESEARCH | 2024, Vol. 22, No. 1
Keywords
Tutorial; Markov decision process; Approximate dynamic programming; Value function approximation; Reinforcement learning; OPTIMIZATION;
DOI
10.1007/s10288-023-00539-3
Chinese Library Classification
C93 [Management]; O22 [Operations Research];
Discipline Classification Codes
070105; 12; 1201; 1202; 120202;
Abstract
This paper provides an introductory tutorial on Value Function Approximation (VFA), a solution class from Approximate Dynamic Programming. VFA is a heuristic approach to solving sequential decision processes such as Markov Decision Processes. Real-world problems in supply chain management (and beyond) that contain dynamic and stochastic elements can be modeled as such processes, but large-scale instances cannot be solved to optimality by enumeration due to the curses of dimensionality. VFA can be a suitable method in these cases, and this tutorial is designed to ease its use in research, practice, and education. To this end, the tutorial describes VFA in the context of stochastic and dynamic transportation and makes three main contributions. First, it gives a concise theoretical overview of VFA's fundamental concepts, outlines a generic VFA algorithm, and briefly discusses advanced topics of VFA. Second, the VFA algorithm is applied to the taxicab problem, an easy-to-understand transportation planning task. Detailed step-by-step results are presented for a small-scale instance, allowing readers to gain an intuition for VFA's main principles. Third, larger instances are solved by enhancing the basic VFA algorithm, demonstrating its general capability to handle more complex problems. The experiments are run on artificial instances, and the respective Python scripts are part of an electronic appendix. Overall, the tutorial provides the knowledge necessary to apply VFA to a wide range of stochastic and dynamic settings and addresses researchers, lecturers, tutors, students, and practitioners alike.
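The generic VFA idea summarized in the abstract (simulate trajectories, observe rewards, and smooth a lookup-table estimate of each state's value with a step size) can be sketched in Python. The toy MDP below, with five states, two moves, and a single rewarded state, plus the epsilon value and harmonic step-size rule, are illustrative assumptions for this sketch, not the paper's actual taxicab model or code.

```python
import random

# Illustrative toy MDP: states 0..4, actions -1/+1 (move left/right),
# with a reward of 1 for reaching state 4. A lookup-table value function
# is smoothed toward one-step lookahead estimates -- a minimal sketch of
# the generic VFA loop, under the assumptions stated above.

N_STATES = 5

def step(state, action):
    """Deterministic transition with an illustrative reward."""
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

def vfa(iterations=2000, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    values = [0.0] * N_STATES   # lookup-table value function approximation
    counts = [0] * N_STATES     # visit counts for harmonic step sizes
    for _ in range(iterations):
        s = rng.randrange(N_STATES)
        # Epsilon-greedy choice over actions {-1, +1} using current values
        if rng.random() < epsilon:
            a = rng.choice([-1, 1])
        else:
            a = max([-1, 1], key=lambda act: values[step(s, act)[0]])
        nxt, r = step(s, a)
        counts[s] += 1
        alpha = 1.0 / counts[s]  # harmonic (1/n) step size
        # Smooth the old estimate toward the one-step lookahead estimate
        values[s] = (1 - alpha) * values[s] + alpha * (r + gamma * values[nxt])
    return values

if __name__ == "__main__":
    # Values should increase toward the rewarded state 4.
    print([round(v, 2) for v in vfa()])
```

The harmonic step size simply averages all observed lookahead estimates per state; the tutorial's advanced topics cover more sophisticated step-size rules and exploration schemes than this sketch uses.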
Pages: 145-173 (29 pages)
Related Papers
50 records in total
  • [41] A grey approximation approach to state value function in reinforcement learning
    Hwang, Kao-Shing
    Chen, Yu-Jen
    Lee, Guar-Yuan
    2007 IEEE INTERNATIONAL CONFERENCE ON INTEGRATION TECHNOLOGY, PROCEEDINGS, 2007: 379+
  • [42] Local and soft feature selection for value function approximation in batch reinforcement learning for robot navigation
    Fathinezhad, Fatemeh
    Adibi, Peyman
    Shoushtarian, Bijan
    Chanussot, Jocelyn
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (08) : 10720 - 10745
  • [43] Integrating Symmetry of Environment by Designing Special Basis functions for Value Function Approximation in Reinforcement Learning
    Wang, Guo-fang
    Fang, Zhou
    Li, Bo
    Li, Ping
    2016 14TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION (ICARCV), 2016,
  • [45] Power System Maintenance Planning Using Value Function Approximation
    Abeygunawardane, Saranga K.
    Jirutitijaroen, Panida
    Xu, Huan
    2014 INTERNATIONAL CONFERENCE ON PROBABILISTIC METHODS APPLIED TO POWER SYSTEMS (PMAPS), 2014,
  • [46] Tutorial on Stochastic Optimization in Energy-Part I: Modeling and Policies
    Powell, Warren B.
    Meisel, Stephan
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2016, 31 (02) : 1459 - 1467
  • [47] DYNAMIC WEB FOR MANAGEMENT TUTORIAL
    Ruiz-Ruiz, J. F.
    Garcia-Munoz, M. A.
    Jodar-Reyes, J.
    Lopez-Moreno, A.
    Ordonez-Canada, C.
    EDULEARN16: 8TH INTERNATIONAL CONFERENCE ON EDUCATION AND NEW LEARNING TECHNOLOGIES, 2016, : 1539 - 1547
  • [48] Two-stage stochastic approximation for dynamic rebalancing of shared mobility systems
    Warrington, Joseph
    Ruchti, Dominik
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2019, 104 : 110 - 134
  • [49] A dynamic mission abort policy for transportation systems with stochastic dependence by deep reinforcement learning
    Liu, Lujie
    Yang, Jun
    Yan, Bingxin
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2024, 241
  • [50] Tutorial on Stochastic Optimization in Energy-Part II: An Energy Storage Illustration
    Powell, Warren B.
    Meisel, Stephan
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2016, 31 (02) : 1468 - 1475