Learning to Entangle Radio Resources in Vehicular Communications: An Oblivious Game-Theoretic Perspective

Cited: 12
Authors
Chen, Xianfu [1 ]
Wu, Celimuge [2 ]
Bennis, Mehdi [3 ]
Zhao, Zhifeng [4 ]
Han, Zhu [5 ,6 ]
Affiliations
[1] VTT Tech Res Ctr Finland, Oulu 1000, Finland
[2] Univ Electrocommun, Grad Sch Informat & Engn, Tokyo 1828585, Japan
[3] Univ Oulu, Ctr Wireless Commun, Oulu 90014, Finland
[4] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Peoples R China
[5] Univ Houston, Houston, TX 77004 USA
[6] Kyung Hee Univ, Dept Comp Sci & Engn, Seoul 02447, South Korea
Funding
Academy of Finland; US National Science Foundation;
Keywords
Vehicle-to-vehicle communications; Markov decision process; stochastic games; Markov perfect equilibrium; oblivious equilibrium; reinforcement learning; NETWORK; INFORMATION;
DOI
10.1109/TVT.2019.2907589
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Codes
0808; 0809;
Abstract
In this paper, we investigate non-cooperative radio resource management in a vehicle-to-vehicle communication network. The technical challenges lie in high vehicle mobility and data traffic variations. Over the discrete scheduling slots, each vehicle user equipment (VUE) pair competes with the other VUE-pairs within the coverage of a road side unit (RSU) for the limited frequency resources to transmit queued data packets, aiming to optimize its expected long-term performance. The frequency allocation at the RSU at the beginning of each slot is regulated by a sealed second-price auction. The interactions among the VUE-pairs are modeled as a stochastic game with a semi-continuous global network state space. By defining a partitioned control policy, we transform the original game into an equivalent stochastic game with a global queue state space of finite size. We adopt an oblivious equilibrium (OE) to approximate the Markov perfect equilibrium, which characterizes the optimal solution to the equivalent game. The OE solution is theoretically proven to exhibit an asymptotic Markov equilibrium property. Due to the lack of a priori knowledge of network dynamics, we derive an online algorithm to learn the OE solution. Numerical simulations validate the theoretical analysis and show the effectiveness of the proposed online learning algorithm.
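The abstract states that the RSU regulates per-slot frequency allocation through a sealed second-price (Vickrey) auction, in which the highest bidder wins the resource but pays only the second-highest bid. A minimal sketch of such an allocation rule is given below; the VUE-pair identifiers and bid values are hypothetical, and the paper's actual mechanism operates over queue states and long-term utilities not modeled here.

```python
def second_price_auction(bids):
    """Allocate a single resource via a sealed second-price (Vickrey) auction.

    bids: dict mapping bidder id -> non-negative bid value.
    Returns (winner, price): the highest bidder wins and pays the
    second-highest bid (0.0 if there is only one bidder, None if no bids).
    """
    if not bids:
        return None, 0.0
    # Rank bidders by bid, highest first; Python's sort is stable, so
    # ties are broken by insertion order.
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


# Hypothetical bids from three VUE-pairs for one scheduling slot:
winner, price = second_price_auction({"vue1": 3.0, "vue2": 5.5, "vue3": 2.0})
# winner is "vue2", who pays the second-highest bid, 3.0
```

The key property of this mechanism, which motivates its use here, is incentive compatibility: since the winner's payment does not depend on its own bid, truthfully bidding one's valuation is a dominant strategy.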
Pages: 4262-4274
Page count: 13