A tutorial on value function approximation for stochastic and dynamic transportation

Cited by: 0
Author
Heinold, Arne [1]
Affiliation
[1] Univ Kiel, Sch Econ & Business, Kiel, Germany
Source
4OR-A QUARTERLY JOURNAL OF OPERATIONS RESEARCH | 2024, Vol. 22, No. 1
Keywords
Tutorial; Markov decision process; Approximate dynamic programming; Value function approximation; Reinforcement learning; Optimization
DOI
10.1007/s10288-023-00539-3
Chinese Library Classification
C93 [Management science]; O22 [Operations research]
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
This paper provides an introductory tutorial on Value Function Approximation (VFA), a solution class from Approximate Dynamic Programming. VFA describes a heuristic way for solving sequential decision processes like a Markov Decision Process. Real-world problems in supply chain management (and beyond) containing dynamic and stochastic elements might be modeled as such processes, but large-scale instances are intractable to be solved to optimality by enumeration due to the curses of dimensionality. VFA can be a proper method for these cases and this tutorial is designed to ease its use in research, practice, and education. For this, the tutorial describes VFA in the context of stochastic and dynamic transportation and makes three main contributions. First, it gives a concise theoretical overview of VFA's fundamental concepts, outlines a generic VFA algorithm, and briefly discusses advanced topics of VFA. Second, the VFA algorithm is applied to the taxicab problem that describes an easy-to-understand transportation planning task. Detailed step-by-step results are presented for a small-scale instance, allowing readers to gain an intuition about VFA's main principles. Third, larger instances are solved by enhancing the basic VFA algorithm demonstrating its general capability to approach more complex problems. The experiments are done with artificial instances and the respective Python scripts are part of an electronic appendix. Overall, the tutorial provides the necessary knowledge to apply VFA to a wide range of stochastic and dynamic settings and addresses likewise researchers, lecturers, tutors, students, and practitioners.
Pages: 145-173
Page count: 29