Online model-based reinforcement learning for decision-making in long distance routes

Cited by: 2

Authors:

Alcaraz, Juan J. [1]

Losilla, Fernando [1]

Caballero-Arnaldos, Luis [1]

Affiliations:

[1] Tech Univ Cartagena UPCT, Dept Informat & Commun Technol, Cartagena, Spain

Source:

TRANSPORTATION RESEARCH PART E-LOGISTICS AND TRANSPORTATION REVIEW | 2022 / Vol. 164

Keywords:

Route scheduling; Reinforcement learning; Model predictive control; Monte Carlo tree search; VEHICLE-ROUTING PROBLEM; TIME WINDOWS; STOCHASTIC TRAVEL; OPTIMIZATION; FRAMEWORK; SERVICE;

DOI:

10.1016/j.tre.2022.102790

Chinese Library Classification (CLC) code:

F [Economics];

Discipline classification code:

02;

Abstract:

In road transportation, long-distance routes require scheduled driving times, breaks, and rest periods, in compliance with the regulations on working conditions for truck drivers, while ensuring goods are delivered within the time windows of each customer. However, routes are subject to uncertain travel and service times, and incidents may cause additional delays, making predefined schedules ineffective in many real-life situations. This paper presents a reinforcement learning (RL) algorithm capable of making en-route decisions regarding driving times, breaks, and rest periods, under uncertain conditions. Our proposal aims at maximizing the likelihood of on-time delivery while complying with drivers' work regulations. We use an online model-based RL strategy that needs no prior training and is more flexible than model-free RL approaches, where the agent must be trained offline before making online decisions. Our proposal combines model predictive control with a rollout strategy and Monte Carlo tree search. At each decision stage, our algorithm anticipates the consequences of all the possible decisions in a number of future stages (the lookahead horizon), and then uses a base policy to generate a sequence of decisions beyond the lookahead horizon. This base policy could be, for example, a set of decision rules based on the experience and expertise of the transportation company covering the routes. Our numerical results show that the policy obtained using our algorithm outperforms not only the base policy (up to 83%), but also a policy obtained offline using deep Q networks (DQN), a state-of-the-art, model-free RL algorithm.
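The lookahead-plus-base-policy idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy route model, the 4-hour driving limit, the speed range, the base policy, and every name below (`step`, `base_policy`, `rollout_decision`, `MAX_DRIVE_H`) are assumptions made for illustration only. The sketch enumerates all decision sequences over a short lookahead horizon, lets a base policy finish each simulated route, and averages the on-time outcome over Monte Carlo samples. It omits the tree search component entirely.

```python
import itertools
import random

ACTIONS = ("drive", "break")
MAX_DRIVE_H = 4  # hypothetical limit on consecutive driving hours (toy rule)


def step(state, action, rng):
    """Advance the toy route model by one hour (all dynamics are assumed)."""
    dist, time_left, since_break, violated = state
    if action == "drive":
        if since_break >= MAX_DRIVE_H:
            violated = True  # driving past the toy limit breaks the rule
        dist -= rng.uniform(60.0, 90.0)  # uncertain km covered in one hour
        since_break += 1
    else:
        since_break = 0  # a one-hour break resets the driving counter
    return (dist, time_left - 1, since_break, violated)


def is_terminal(state):
    dist, time_left, _, violated = state
    return dist <= 0 or time_left <= 0 or violated


def reward(state):
    """1.0 if the destination was reached without a violation, else 0.0."""
    dist, _, _, violated = state
    return 1.0 if dist <= 0 and not violated else 0.0


def base_policy(state):
    """Hypothetical rule of thumb: take a break after every 2 hours of driving."""
    return "drive" if state[2] < 2 else "break"


def rollout_decision(state, horizon=2, n_sims=50, rng=None):
    """Return (first action, estimated value) of the best lookahead sequence."""
    rng = rng or random.Random(0)
    best_action, best_value = None, float("-inf")
    for seq in itertools.product(ACTIONS, repeat=horizon):
        total = 0.0
        for _ in range(n_sims):
            s = state
            for a in seq:  # decisions inside the lookahead horizon
                if is_terminal(s):
                    break
                s = step(s, a, rng)
            while not is_terminal(s):  # base policy beyond the horizon
                s = step(s, base_policy(s), rng)
            total += reward(s)
        value = total / n_sims
        if value > best_value:
            best_action, best_value = seq[0], value
    return best_action, best_value


# 100 km left, 5 hours to the deadline, 4 hours driven since the last break.
action, value = rollout_decision((100.0, 5, 4, False))
print(action, value)
```

In this toy state, any sequence that starts with "drive" violates the assumed driving limit, so the lookahead selects "break" even though the base policy alone is never consulted for that first decision. Only the first action of the best sequence is executed, and the procedure is repeated at the next decision stage, in the spirit of model predictive control.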


Pages: 21
