Research on Dynamic Subsidy Based on Deep Reinforcement Learning for Non-Stationary Stochastic Demand in Ride-Hailing

Cited by: 1
Authors
Huang, Xiangyu [1]
Cheng, Yan [1]
Jin, Jing [1]
Kou, Aiqing [1]
Affiliations
[1] East China University of Science and Technology, School of Business, Shanghai 200237, People's Republic of China
Keywords
ride-hailing; non-stationary stochastic demand; change point detection; non-stationary Markov decision process; deep reinforcement learning
DOI
10.3390/su16156289
CLC number
X [Environmental Science, Safety Science]
Subject classification codes
08; 0830
Abstract
The ride-hailing market often experiences significant fluctuations in traffic demand, resulting in supply-demand imbalances. Ride-hailing platforms therefore frequently employ dynamic subsidy strategies to incentivize drivers to relocate to high-demand zones. However, determining the appropriate subsidy amount at the appropriate time remains challenging. First, traffic demand is highly non-stationary, characterized by multiple context patterns with time-varying statistical features. Second, the state and action spaces are high-dimensional, spanning multiple spatiotemporal dimensions and context patterns. Third, decisions must be made in real time. To address these challenges, we first construct a Non-Stationary Markov Decision Process (NSMDP) based on assumptions about the dynamics of the ride-hailing service system, and then develop a solution framework for it. Within the framework, a change point detection method based on a feature-enhanced LSTM identifies the change points and time-varying context patterns of stochastic demand, and a deterministic-policy deep reinforcement learning algorithm optimizes the subsidy decisions. Finally, through simulation experiments with real-world historical data, we demonstrate the effectiveness of the proposed approach: it improves the platform's profits and alleviates supply-demand imbalances under the dynamic subsidy strategy. The results also show that a well-designed dynamic subsidy strategy is particularly effective in the high-demand context pattern, where fluctuations are more drastic, and that the profitability of the dynamic subsidy strategy increases with the level of non-stationarity.
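To make the two methodological components described in the abstract concrete, the following is a minimal, hypothetical sketch in Python/PyTorch, not the authors' implementation: a feature-enhanced LSTM that classifies the current demand context pattern from a window of zone-level demand plus auxiliary calendar features, feeding a deterministic-policy actor (DDPG-style; the paper specifies only a deterministic-policy deep RL algorithm) that outputs bounded per-zone subsidy amounts. All module names, dimensions, and inputs such as the auxiliary features are illustrative assumptions.

```python
# Hedged sketch only: module names, dimensions, and the auxiliary calendar
# features are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn


class ContextDetector(nn.Module):
    """LSTM over a recent demand window, augmented ("feature-enhanced") with
    auxiliary features, that outputs logits over demand context patterns."""

    def __init__(self, n_zones: int, n_aux: int, n_contexts: int, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_zones + n_aux, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, n_contexts)

    def forward(self, demand_window: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # demand_window: (batch, T, n_zones); aux: (batch, T, n_aux)
        x = torch.cat([demand_window, aux], dim=-1)   # feature enhancement
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])                       # context-pattern logits


class SubsidyActor(nn.Module):
    """Deterministic policy: maps the current supply-demand state and the
    detected context to a per-zone subsidy bounded by max_subsidy."""

    def __init__(self, state_dim: int, n_contexts: int, n_zones: int,
                 max_subsidy: float = 5.0):
        super().__init__()
        self.max_subsidy = max_subsidy
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_contexts, 128), nn.ReLU(),
            nn.Linear(128, n_zones), nn.Sigmoid(),    # squash to [0, 1]
        )

    def forward(self, state: torch.Tensor, context_logits: torch.Tensor) -> torch.Tensor:
        ctx = torch.softmax(context_logits, dim=-1)
        return self.max_subsidy * self.net(torch.cat([state, ctx], dim=-1))


# Dummy forward pass: 4 zones, 3 context patterns, a 12-interval demand window.
detector = ContextDetector(n_zones=4, n_aux=2, n_contexts=3)
actor = SubsidyActor(state_dim=8, n_contexts=3, n_zones=4)
demand = torch.randn(1, 12, 4)   # recent zone-level demand
aux = torch.randn(1, 12, 2)      # e.g. time-of-day / day-of-week features
state = torch.randn(1, 8)        # current supply-demand state
subsidy = actor(state, detector(demand, aux))
print(subsidy)                   # per-zone subsidy amounts in [0, max_subsidy]
```

In a full training loop, the actor would be trained with a critic on simulated platform profit, and the detector would be re-fit or re-weighted whenever a change point is flagged; those pieces are omitted from this sketch.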
Pages: 25