Balancing Performance and Cost for Two-Hop Cooperative Communications: Stackelberg Game and Distributed Multi-Agent Reinforcement Learning

Citations: 0
Authors
Geng, Yuanzhe [1]
Liu, Erwu [1]
Ni, Wei [2]
Wang, Rui [3]
Liu, Yan [1]
Xu, Hao [1]
Cai, Chen [4]
Jamalipour, Abbas [5]
Affiliations
[1] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[2] Commonwealth Sci & Ind Res Org, Data61, Marsfield, NSW 2122, Australia
[3] Tongji Univ, Coll Elect & Informat Engn, Shanghai Inst Intelligent Sci & Technol, Shanghai 201804, Peoples R China
[4] Tongji Univ, Inst Carbon Neutral, Coll Environm Sci & Engn, Shanghai 200092, Peoples R China
[5] Univ Sydney, Sch Elect & Informat Engn, Fac Engn, Sydney, NSW 2006, Australia
Funding
U.S. National Science Foundation;
Keywords
Relays; Games; Optimization; Cooperative communication; Costs; Channel capacity; Signal to noise ratio; power control; multi-agent reinforcement learning; Stackelberg game; DETERMINISTIC POLICY GRADIENT; RELAY SELECTION; ALLOCATION; POWER; OPTIMIZATION;
DOI
10.1109/TCCN.2024.3400516
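Chinese Library Classification number
TN [Electronic technology, communication technology];
Discipline classification code
0809;
Abstract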
This paper aims to balance performance and cost in a two-hop wireless cooperative communication network where the source and relays have contradictory optimization goals and make decisions in a distributed manner. This differs from most existing works that have typically assumed that source and relay nodes follow a schedule created implicitly by a central controller. We propose that the relays form an alliance in an attempt to maximize the benefit of relaying while the source aims to increase the channel capacity cost-effectively. To this end, we establish the trade problem as a Stackelberg game, and prove the existence of its equilibrium. Another important aspect is that we use multi-agent reinforcement learning (MARL) to approach the equilibrium in a situation where the instantaneous channel state information (CSI) is unavailable, and the source and relays do not have knowledge of each other's goal. A multi-agent deep deterministic policy gradient-based framework is designed, where the relay alliance and the source act as agents. Experiments demonstrate that the proposed method can obtain an acceptable performance that is close to the game-theoretic equilibrium for all players under time-invariant environments, which considerably outperforms its potential alternatives and is only about 2.9% away from the optimal solution.
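As a rough illustration of the leader-follower structure named in the abstract, the sketch below solves a toy Stackelberg game by backward induction. It is not the paper's model: the utilities, the choice of the relay alliance as the price-setting leader, and every name and parameter (`gain`, `cost`, `q_max`, `p_max`, `steps`) are assumptions made only for this example. The follower (source) buys relay power `q` to maximize `log2(1 + gain*q) - price*q`, and the leader picks the unit price that maximizes `(price - cost) * q` given that best response.

```python
import math


def follower_best_response(price, gain=4.0, q_max=5.0):
    """Power the follower buys to maximize log2(1 + gain*q) - price*q (toy utility)."""
    if price <= 0:
        return q_max
    # Stationary point of the concave follower utility, clipped to [0, q_max].
    q = 1.0 / (price * math.log(2)) - 1.0 / gain
    return min(max(q, 0.0), q_max)


def follower_utility(price, q, gain=4.0):
    """Capacity-like benefit minus payment (hypothetical form)."""
    return math.log2(1.0 + gain * q) - price * q


def leader_utility(price, cost=0.2, gain=4.0, q_max=5.0):
    """Leader's profit when the follower reacts optimally to the posted price."""
    q = follower_best_response(price, gain, q_max)
    return (price - cost) * q


def stackelberg_equilibrium(cost=0.2, gain=4.0, q_max=5.0, p_max=6.0, steps=2000):
    """Backward induction: grid-search the leader's price over [cost, p_max]."""
    prices = (cost + (p_max - cost) * i / steps for i in range(steps + 1))
    best_price = max(prices, key=lambda p: leader_utility(p, cost, gain, q_max))
    return best_price, follower_best_response(best_price, gain, q_max)


if __name__ == "__main__":
    price, power = stackelberg_equilibrium()
    print(f"leader price = {price:.3f}, follower power = {power:.3f}")
```

The closed-form best response is available here only because the toy utility is known and concave. The abstract describes a setting where instantaneous CSI and the other party's goal are unknown, which is why the paper turns to multi-agent reinforcement learning instead of solving the game directly.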
Pages: 2193 - 2208
Number of pages: 16