Fast Link Scheduling in Wireless Networks Using Regularized Off-Policy Reinforcement Learning

被引：1

作者：

Bhattacharya, Sagnik ^{[1
]}

Banerjee, Ayan ^{[1
]}

Peruru, Subrahmanya Swamy ^{[1
]}

Srinivas, Kothapalli Venkata ^{[2
]}

机构：

[1] IIT Kanpur, Department of Electrical Engineering, Kanpur

[2] Motorola Mobility, Chicago, 60654, IL

来源：

IEEE Networking Letters | 2023年 / 5卷 / 02期

关键词：

graph neural network; maximal independent set; Reinforcement learning; scheduling; time since last service;

D O I：

10.1109/LNET.2023.3264486

中图分类号：

学科分类号：

摘要：

The centralized-link-scheduling problem in a wireless network graph involves solving the maximum-weighted-independent-set (MWIS) problem on the conflict graph. In this letter, we propose a novel regularized off-policy reinforcement learning-based MWIS solver and use for the scheduling problem. The proposed MWIS algorithm achieves 17% improvement over state-of-the-art heuristic solver KaMIS, 60% over greedy solver, 16% and 17% over RL-based solvers LwD and S2V-DQN, respectively. We show that our scheduler achieves stable throughput values 14% and 22% higher than LwD and a distributed greedy scheduler, respectively. We demonstrate the flexibility of our RL algorithm by modifying it to create a time-since-last-service-aware scheduler. © 2019 IEEE.

引用

页码：86 / 90

页数：4

共 15 条

[1]

Niu Y., Et al., Energy-efficient scheduling for mmWave backhauling of small cells in heterogeneous cellular networks, IEEE Trans. Veh. Technol., 66, 3, pp. 2674-2687, (2017)

[2]

Joo C., Lin X., Ryu J., Shroff N.B., Distributed greedy approximation to maximum weighted independent set for scheduling with fading channels, IEEE/ACM Trans. Netw., 24, 3, pp. 1476-1488, (2016)

[3]

Zhao Z., Verma G., Rao C., Swami A., Segarra S., Distributed scheduling using graph neural networks, Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 4720-4724, (2021)

[4]

Lamm S., Schulz C., Strash D., Williger R., Zhang H., Exactly solving the maximum weight independent set problem on large realworld graphs, Proc. 21st Workshop Algorithm Eng. Exp. (ALENEX), pp. 144-158, (2019)

[5]

Gurobi optimizer reference manual, (2022)

[6]

Feo T.A., Resende M.G.C., Smith S.H., A greedy randomized adaptive search procedure for maximum independent set, Oper. Res., 42, 5, pp. 860-878, (1994)

[7]

Ahn S., Seo Y., Shin J., Learning what to defer for maximum independent sets, Proc. 37th Int. Conf. Mach. Learn., 119, pp. 134-144, (2020)

[8]

Schulman J., Wolski F., Dhariwal P., Radford A., Klimov O., Proximal policy optimization algorithms, (2017)

[9]

Hausknecht M., Stone P., On-policy vs. off-policy updates for deep reinforcement learning, Proc. Deep Reinforcement Learn. Front. Challenges IJCAI Workshop, pp. 1-7, (2016)

[10]

Hessel M., Et al., Muesli: Combining improvements in policy optimization, Proc. 38th Int. Conf. Mach. Learn., 139, pp. 4214-4226, (2021)

← 1 2 →