Online Learning-Based Beamforming for Rate-Splitting Multiple Access: A Constrained Bandit Approach
Authors:
Wang, Shangshang [1]
Wang, Jingye [1]
Mao, Yijie [1]
Shao, Ziyu [1]
Affiliation:
[1] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
Source:
ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS | 2023
Funding:
Natural Science Foundation of Shanghai;
Keywords:
PARTIAL CSIT;
DOI:
10.1109/ICC45041.2023.10278992
Chinese Library Classification (CLC) number:
TN [Electronic technology, communication technology];
Discipline classification code:
0809;
Abstract:
Rate-splitting multiple access (RSMA) has emerged as a promising non-orthogonal transmission strategy and a powerful interference management scheme for 6G. Most existing works on RSMA beamforming design assume that instantaneous or statistical channel state information (CSI) is available at the transmitter. Such an assumption, however, is impractical, especially in massive multiple-input multiple-output (MIMO) systems, because of dynamic wireless environments and the challenges of channel estimation. In this work, we propose a novel beamforming design framework based on online learning and online control that adaptively learns the best precoding action for an RSMA-aided downlink massive MIMO system without explicit CSI feedback. In particular, we first formulate the precoder selection problem, which maximizes the ergodic sum-rate subject to a long-term transmit power constraint, as a constrained combinatorial multi-armed bandit (CMAB) problem. We then propose a precoder selection with bandit learning algorithm for RSMA (PBR). Our theoretical analysis shows that PBR achieves a sublinear regret bound while guaranteeing the long-term power constraint. Experimental results verify the theoretical analysis and show that PBR outperforms conventional transmission schemes that do not use RSMA in terms of both sum-rate and power consumption.
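To make the idea of a bandit with a long-term constraint concrete, the following is a minimal Python sketch of a generic technique: upper-confidence-bound (UCB) arm selection combined with a virtual queue that penalizes costly arms once the average cost drifts above a budget. It is not the PBR algorithm from the paper, whose details are not given in the abstract. It is also simplified to single arms rather than combinatorial super arms. The function name `constrained_ucb`, the trade-off weight `v`, the Bernoulli reward model, and all numeric values are assumptions made purely for illustration; each arm can be read as a candidate precoder, the reward as a normalized sum-rate, and the cost as transmit power.

```python
import math
import random


def constrained_ucb(reward_means, costs, budget, horizon, v=10.0, seed=0):
    """Generic UCB + virtual-queue sketch (hypothetical; NOT the paper's PBR).

    reward_means: true Bernoulli reward probability of each arm (unknown to the learner)
    costs:        known per-round cost of each arm (stand-in for transmit power)
    budget:       target long-term average cost
    v:            assumed weight trading reward against constraint violation
    """
    rng = random.Random(seed)
    k = len(reward_means)
    counts = [0] * k      # number of times each arm was played
    means = [0.0] * k     # empirical mean reward of each arm
    queue = 0.0           # virtual queue: backlog of accumulated budget violation
    total_reward = 0.0
    total_cost = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # play every arm once to initialise the estimates
        else:
            scores = []
            for a in range(k):
                bonus = math.sqrt(2.0 * math.log(t) / counts[a])
                ucb = min(1.0, means[a] + bonus)          # optimistic reward estimate
                scores.append(v * ucb - queue * costs[a])  # reward minus queue-weighted cost
            arm = max(range(k), key=lambda a: scores[a])

        reward = 1.0 if rng.random() < reward_means[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
        queue = max(0.0, queue + costs[arm] - budget)      # grows when cost exceeds budget
        total_reward += reward
        total_cost += costs[arm]

    return total_reward / horizon, total_cost / horizon, counts


if __name__ == "__main__":
    avg_reward, avg_cost, counts = constrained_ucb(
        reward_means=[0.2, 0.5, 0.9], costs=[0.2, 0.5, 1.0], budget=0.6, horizon=5000
    )
    print(avg_reward, avg_cost, counts)
```

In this sketch the queue stays bounded as long as at least one arm costs less than the budget, so the time-averaged cost exceeds the budget by at most the final queue length divided by the horizon. A larger `v` favors reward and allows a longer backlog; a smaller `v` enforces the budget more tightly.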