Scalable reinforcement learning approaches for dynamic pricing in ride-hailing systems

Cited by: 11
Authors
Lei, Zengxiang [1 ]
Ukkusuri, Satish V. [1 ]
Affiliations
[1] Purdue Univ, Lyles Sch Civil Engn, W Lafayette, IN 47907 USA
Keywords
Ride-hailing; Dynamic pricing; Reinforcement learning; Markov decision process; Model; Platforms
DOI
10.1016/j.trb.2023.102848
Chinese Library Classification
F [Economics]
Discipline code
02
Abstract
Dynamic pricing is a strategy widely applied by ride-hailing companies, such as Uber and Lyft, to match trip demand with the availability of drivers. Deciding proper pricing policies is challenging, and existing reinforcement learning (RL)-based solutions are restricted to solving small-scale problems. In this study, we contribute RL-based approaches that can address the dynamic pricing problem in real-world-scale ride-hailing systems. We first characterize the dynamic pricing problem with a clear distinction between historical prices and current prices. We then translate our dynamic pricing problem into a Markov decision process (MDP) and prove the existence of a deterministic stationary optimal policy. Our solutions are based on an off-policy reinforcement learning algorithm called twin-delayed deep deterministic policy gradient (TD3), which performs offline learning of the optimal pricing policy using historical data and applies the learned policy to the next time slot, e.g., one week. We enhance TD3 with three mechanisms that reduce model complexity and improve training effectiveness. Extensive numerical experiments are conducted on both small grid networks (16 zones) and the NYC network (242 zones) to demonstrate the performance of the proposed algorithm. The results show that our algorithm can efficiently find the optimal pricing policy for both the small and large networks, and can significantly enhance platform profit and service efficiency.
Pages: 19