Toward a Fully-Observable Markov Decision Process With Generative Models for Integrated 6G-Non-Terrestrial Networks

Cited by: 7
Authors
Machumilane, A. [1]
Cassara, P. [2]
Gotta, A. [2]
Affiliations
[1] Univ Pisa, Dept Informat Engn, I-56126 Pisa, Italy
[2] CNR, Inst Informat Sci & Technol, I-56124 Pisa, Italy
Source
IEEE OPEN JOURNAL OF THE COMMUNICATIONS SOCIETY | 2023 / Vol. 4
Keywords
NTN; satellite; generative models (GMs); reinforcement learning; actor-critic; multipath; traffic scheduling
DOI
10.1109/OJCOMS.2023.3307209
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract
The upcoming sixth generation (6G) mobile networks require integration between terrestrial mobile networks and non-terrestrial networks (NTN) such as satellites and high altitude platforms (HAPs) to ensure wide and ubiquitous coverage, high connection density, reliable communications and high data rates. The main challenge in this integration is the requirement for line-of-sight (LOS) communication between the user equipment (UE) and the satellite. In this paper, we propose a framework based on actor-critic reinforcement learning and generative models for LOS estimation and traffic scheduling on multiple links connecting a user equipment to multiple satellites in 6G-NTN integrated networks. The agent learns to estimate the LOS probabilities of the available channels and schedules traffic on appropriate links to minimise end-to-end losses with minimal bandwidth. The learning process is modelled as a partially observable Markov decision process (POMDP), since the agent can only observe the state of the channels it has just accessed. As a result, the learning agent requires a longer convergence time compared to the satellite visibility period at a given satellite elevation angle. To counteract this slow convergence, we use generative models to transform a POMDP into a fully observable Markov decision process (FOMDP). We use generative adversarial networks (GANs) and variational autoencoders (VAEs) to generate synthetic channel states of the channels that are not selected by the agent during the learning process, allowing the agent to have complete knowledge of all channels, including those that are not accessed, thus speeding up the learning process. The simulation results show that our framework enables the agent to converge in a short time and transmit with an optimal policy for most of the satellite visibility period, which significantly reduces end-to-end losses and saves bandwidth. 
We also show that it is possible to train generative models in real time without requiring prior knowledge of the channel models and without slowing down the learning process or affecting the accuracy of the models.
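The POMDP-to-FOMDP step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `BernoulliGenerator` class is a hypothetical stand-in for a trained GAN or VAE, the binary LOS (1) / NLOS (0) channel encoding is an assumption, and all function names are invented for illustration. The sketch only shows how synthetic states for unaccessed channels can be merged with real observations to give the agent a complete state vector.

```python
import random


def partial_observation(true_states, selected):
    """Return what the agent sees: real states for accessed channels, None elsewhere."""
    return [s if i in selected else None for i, s in enumerate(true_states)]


class BernoulliGenerator:
    """Hypothetical stand-in for a trained GAN/VAE (assumption, not the paper's model).

    It samples a LOS (1) or NLOS (0) state per channel from a running
    LOS-frequency estimate built from the channels the agent has accessed.
    """

    def __init__(self, n_channels, seed=0):
        self.los_counts = [1] * n_channels  # prior pseudo-count of LOS outcomes
        self.totals = [2] * n_channels      # prior pseudo-count of all outcomes
        self.rng = random.Random(seed)

    def update(self, observation):
        """Online update from the observed (non-None) channel states."""
        for i, s in enumerate(observation):
            if s is not None:
                self.los_counts[i] += s
                self.totals[i] += 1

    def sample(self, channel):
        """Draw a synthetic state for one channel."""
        p_los = self.los_counts[channel] / self.totals[channel]
        return 1 if self.rng.random() < p_los else 0


def complete_observation(observation, generator):
    """Fill unobserved entries with synthetic states to obtain a full state vector."""
    return [s if s is not None else generator.sample(i)
            for i, s in enumerate(observation)]


# Example: four satellite links, the agent accessed links 0 and 2 only.
true_states = [1, 0, 1, 1]
obs = partial_observation(true_states, selected={0, 2})
gen = BernoulliGenerator(n_channels=4)
gen.update(obs)
full = complete_observation(obs, gen)
```

Here `obs` contains `None` for the two links that were not accessed, while `full` has a value for every link. An actor-critic agent would take `full` as its state input instead of `obs`, which is the sense in which the partially observable problem is treated as fully observable.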
Pages: 1913-1930
Page count: 18