Single and Multi-Agent Deep Reinforcement Learning for AI-Enabled Wireless Networks: A Tutorial

Cited by: 225
Authors
Feriani, Amal [1]
Hossain, Ekram [1]
Affiliations
[1] Univ Manitoba, Dept Elect & Comp Engn, Winnipeg, MB R2M 2J8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Tutorials; Wireless networks; Games; Computational modeling; Training; 5G mobile communication; Reinforcement learning; AI-enabled wireless networks; deep reinforcement learning (DRL); multi-agent reinforcement learning (MARL); model-based reinforcement learning (MBRL); decentralized networks; MULTICHANNEL ACCESS;
DOI
10.1109/COMST.2021.3063822
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 (Computer Science and Technology);
Abstract
Deep Reinforcement Learning (DRL) has recently witnessed significant advances that have led to multiple successes in solving sequential decision-making problems in various domains, particularly in wireless communications. The next generation of wireless networks is expected to provide scalable, low-latency, ultra-reliable services empowered by the application of data-driven Artificial Intelligence (AI). The key enabling technologies of future wireless networks, such as intelligent meta-surfaces, aerial networks, and AI at the edge, involve more than one agent, which motivates the use of multi-agent learning techniques. Furthermore, cooperation is central to establishing self-organizing, self-sustaining, and decentralized networks. In this context, this tutorial focuses on the role of DRL, with an emphasis on deep Multi-Agent Reinforcement Learning (MARL), for AI-enabled wireless networks. The first part of the paper presents a clear overview of the mathematical frameworks for single-agent RL and MARL. The main goal of this work is to motivate the application of RL beyond the model-free perspective that has been extensively adopted in recent years. Thus, we provide a selective description of RL approaches such as Model-Based RL (MBRL) and cooperative MARL, and we highlight their potential applications in future wireless networks. Finally, we overview the state of the art of MARL in fields such as Mobile Edge Computing (MEC), Unmanned Aerial Vehicle (UAV) networks, and cell-free massive MIMO, and identify promising future research directions. We expect this tutorial to stimulate more research endeavors to build scalable and decentralized systems based on MARL.
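As a concrete, hypothetical illustration of the multi-agent setting the tutorial surveys (this sketch is not taken from the paper), the Python snippet below trains two independent Q-learners that share two channels in a toy, stateless multichannel-access game: each agent picks a channel and is rewarded only when it avoids a collision. The agent and channel counts, reward definition, and hyperparameters are assumptions chosen purely for brevity.

# Minimal, hypothetical sketch (not from the paper): two independent Q-learners
# sharing two channels -- a toy stateless cooperative game illustrating the
# "independent learners" baseline often used in MARL for multichannel access.
import random

N_AGENTS, N_CHANNELS = 2, 2
EPSILON, ALPHA = 0.1, 0.1          # exploration rate, learning rate
# One Q-value per (agent, channel); the game has a single state, so no
# bootstrapping term is needed (i.e., gamma = 0).
Q = [[0.0] * N_CHANNELS for _ in range(N_AGENTS)]

def act(q_row):
    # Epsilon-greedy channel selection for one agent.
    if random.random() < EPSILON:
        return random.randrange(N_CHANNELS)
    return max(range(N_CHANNELS), key=lambda c: q_row[c])

for step in range(5000):
    choices = [act(Q[i]) for i in range(N_AGENTS)]
    for i, c in enumerate(choices):
        # Reward 1 only if the agent transmitted on a collision-free channel.
        reward = 1.0 if choices.count(c) == 1 else 0.0
        Q[i][c] += ALPHA * (reward - Q[i][c])

print("Learned channel preferences:", Q)

Because each agent treats the others as part of its environment, this is the simplest decentralized MARL baseline; the cooperative and model-based methods discussed in the tutorial address the non-stationarity and coordination issues that such independent learners exhibit at scale.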
Pages: 1226-1252
Number of pages: 27