A Reinforcement Learning and Prediction-Based Lookahead Policy for Vehicle Repositioning in Online Ride-Hailing Systems

Cited by: 4

Authors:

Wei, Honghao [1]

Yang, Zixian [2]

Liu, Xin [3]

Qin, Zhiwei [4,5]

Tang, Xiaocheng [4,6]

Ying, Lei [2]

Affiliations:

[1] Washington State Univ, EECS, Pullman, WA 99164 USA

[2] Univ Michigan, EECS, Ann Arbor, MI 48109 USA

[3] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai 201210, Peoples R China

[4] Lab DiDi, Mountain View, CA 94043 USA

[5] Lyft Rideshare Labs, San Francisco, CA 94107 USA

[6] Meta AI, San Francisco, CA 94102 USA

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / Issue 02

Keywords:

Ride-hailing; large-scale; reposition; idle car routing; reinforcement learning

DOI:

10.1109/TITS.2023.3312048

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Existing approaches for vehicle repositioning on large-scale ride-hailing platforms either ignore the real-time spatial-temporal mismatch between supply and demand or overlook the long-term balance of the system. To account for both, we propose a lookahead repositioning policy, a novel approach that repositions idle vehicles from both a dynamic-system and a long-term performance perspective. Our method consists of two parts. The first uses linear programming (LP) to formulate the nonstationary system as a time-varying, T-step lookahead optimization problem and explicitly models the fraction of drivers who follow repositioning recommendations (called the repositioning rate). The second incorporates a reinforcement learning (RL) method to maximize the long-term return beyond the T time slots, based on learned value functions. Extensive studies using a real-world dataset on both small-scale and large-scale simulators show that our method outperforms previous baseline methods and is robust to prediction errors.
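The two-part structure described in the abstract can be illustrated with a deliberately small sketch. It is not the paper's formulation: it assumes two zones and a single lookahead step, and it replaces the LP with a plain grid search over the fraction of idle vehicles recommended to move. All names (`lookahead_objective`, `best_move_fraction`, `follow_rate`, `terminal_value`) and all numbers are hypothetical. The point is only to show how the repositioning rate scales the recommended flow and how a learned value function could enter as a terminal term.

```python
def lookahead_objective(move_frac, idle, demand, follow_rate,
                        terminal_value, discount=0.9):
    """Toy 1-step, 2-zone objective (illustrative assumption, not the paper's LP).

    Returns orders served now plus the discounted value of leftover vehicles.
    """
    # Only a fraction `follow_rate` of recommended drivers actually move
    # from zone 0 to zone 1 (the "repositioning rate" in the abstract).
    moved = idle[0] * move_frac * follow_rate
    supply = [idle[0] - moved, idle[1] + moved]

    # Immediate reward: each zone serves at most its own demand.
    served = sum(min(s, d) for s, d in zip(supply, demand))

    # Leftover vehicles are scored by a hypothetical learned per-zone value,
    # standing in for the RL value function applied after the lookahead window.
    leftover = [max(s - d, 0.0) for s, d in zip(supply, demand)]
    future = sum(v * l for v, l in zip(terminal_value, leftover))
    return served + discount * future


def best_move_fraction(idle, demand, follow_rate, terminal_value, steps=100):
    """Grid search over recommended move fractions (stand-in for an LP solver)."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(
        candidates,
        key=lambda f: lookahead_objective(
            f, idle, demand, follow_rate, terminal_value
        ),
    )


# Hypothetical scenario: 10 idle vehicles in zone 0, demand of 4 and 6 orders.
full_compliance = best_move_fraction([10, 0], [4, 6], 1.0, [0.1, 0.1])
half_compliance = best_move_fraction([10, 0], [4, 6], 0.5, [0.1, 0.1])
```

In this toy setting, lower compliance pushes the search toward recommending a larger move fraction, because fewer of the recommended drivers actually relocate. A real implementation would handle many zones and T time slots, which is where an LP formulation becomes the natural tool.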

Citation

Pages: 1846-1856

Page count: 11

20 entries in total

[1] Pricing and Prioritizing Time-Sensitive Customers with Heterogeneous Demand Rates [J].