Human-in-the-Loop Learning for Dynamic Congestion Games

Cited by: 2

Authors:

Li, Hongbo [1]

Duan, Lingjie [1]

Affiliations:

[1] Singapore Univ Technol & Design, Pillar Engn Syst & Design, Singapore 487372, Singapore

Source:

IEEE TRANSACTIONS ON MOBILE COMPUTING | 2024 / Vol. 23 / Issue 12

Keywords:

Stochastic processes; Games; Routing; Hazards; Costs; Mobile computing; Human in the loop; Dynamic congestion games; human-in-the-loop learning; mechanism design; price of anarchy; BANDIT;

DOI:

10.1109/TMC.2024.3391697

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Today, mobile users learn and share their traffic observations via crowdsourcing platforms (e.g., Google Maps and Waze). Yet such platforms simply cater to selfish users' myopic interests by recommending the shortest path, and do not encourage enough users to travel and learn other paths for the benefit of future users. Prior studies focus on one-shot congestion games without considering users' information learning, while our work studies how users learn and alter traffic conditions on stochastic paths in a human-in-the-loop manner. In a typical parallel routing network with one deterministic path and multiple stochastic paths, our analysis shows that the myopic routing policy (used by Google Maps and Waze) leads to severe under-exploration of stochastic paths. This results in a price of anarchy (PoA) greater than 2, as compared to the socially optimal policy achieved through the optimal exploration-exploitation tradeoff in minimizing the long-term social cost. Besides, the myopic policy fails to ensure the correct learning convergence of users' traffic hazard beliefs. To address this, we focus on informational (non-monetary) mechanisms as they are easier to implement than pricing. We first show that existing information-hiding mechanisms and deterministic path-recommendation mechanisms in the Bayesian persuasion literature do not work, even yielding PoA $= \infty$. Accordingly, we propose a new combined hiding and probabilistic recommendation (CHAR) mechanism to hide all information from a selected user group and provide state-dependent probabilistic recommendations to the other user group. Our CHAR mechanism successfully ensures PoA less than $\frac{5}{4}$, which cannot be further reduced by any other informational (non-monetary) mechanism. Besides the parallel network, we further extend our analysis and CHAR mechanism to more general linear path graphs with multiple intermediate nodes, and we prove that the PoA results remain unchanged. Additionally, we carry out experiments with real-world datasets to further extend our routing graphs and verify the close-to-optimal performance of our CHAR mechanism.

Pages: 11159-11171

Number of pages: 13
