Routing Selection With Reinforcement Learning for Energy Harvesting Multi-Hop CRN

Cited by: 27
Authors
He, Xiaoli [1, 2]
Jiang, Hong [1]
Song, Yu [1, 3]
He, Chunlin [4 ]
Xiao, He [1 ]
Affiliations
[1] South West Univ Sci & Technol, Sch Informat Engn, Mianyang 621010, Peoples R China
[2] Sichuan Univ Sci & Engn, Sch Comp Sci, Zigong 643000, Peoples R China
[3] Sichuan Univ Sci & Engn, Dept Network Informat Management Ctr, Zigong 643000, Peoples R China
[4] China West Normal Univ, Sch Comp Sci, Nanchong 637009, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Routing selection; multi-hop CRN; energy harvesting; Q learning; reinforcement learning; MDP; COGNITIVE RADIO NETWORKS; ALLOCATION;
DOI
10.1109/ACCESS.2019.2912996
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper considers the routing problem in an energy harvesting (EH) multi-hop cognitive radio network (CRN). The transmitter and the relays harvest energy from the environment and use it exclusively for transmitting data. Each relay on the path uses a limited data buffer to store received data before forwarding it. We consider a real-world scenario in which an EH node has only local causal knowledge, i.e., at any time, each EH node knows only its own EH process, channel state, and currently received data. An EH routing algorithm based on Q learning in reinforcement learning (RL) for multi-hop CRNs (EHR-QL) is proposed. Our goal is to find an optimal routing policy that maximizes throughput and minimizes energy consumption. Through continuous intelligent selection under a partially observable Markov decision process (POMDP), we use the Q learning algorithm in RL with linear function approximation to obtain the optimal path. Compared with basic Q learning routing, the EHR-QL is superior for longer distances and higher hop counts. The algorithm harvests more energy, consumes less energy, and yields predictable residual energy. In particular, the time complexity of the EHR-QL is analyzed and its convergence is proved. In the simulation experiments, first, we verify the EHR-QL using six EH secondary user (EH-SU) nodes. Second, the performance (i.e., network lifetime, residual energy, and average throughput) of the EHR-QL is evaluated under different learning rates alpha and discount factors gamma. Finally, the experimental results show that the EHR-QL obtains a higher throughput, a longer network lifetime, less latency, and lower energy consumption than the basic Q learning routing algorithms.
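The abstract names Q learning with linear function approximation as the core technique. The block below is a minimal sketch of that generic technique on a toy next-hop selection problem; it is not the authors' EHR-QL. The four-node topology, the per-hop energy costs, the one-hot link features, and the reward (negative hop cost) are all assumptions made for illustration, and the POMDP, energy harvesting, and data buffer aspects are left out. Only the learning rate alpha and discount factor gamma correspond to parameters mentioned in the abstract.

```python
import numpy as np

# Hypothetical toy topology: node -> {next_hop: energy cost of that hop}.
# Node 0 is the source and node 3 the destination (no outgoing links).
LINKS = {0: {1: 1.0, 2: 3.0}, 1: {3: 1.0}, 2: {3: 1.0}, 3: {}}
EDGES = [(u, v) for u, nbrs in LINKS.items() for v in nbrs]


def phi(node, next_hop):
    """Feature vector for (state, action); one-hot per link is an assumption."""
    vec = np.zeros(len(EDGES))
    vec[EDGES.index((node, next_hop))] = 1.0
    return vec


def q_value(weights, features):
    """Linear approximation: Q(s, a) = w . phi(s, a)."""
    return float(np.dot(weights, features))


def q_learning_update(weights, phi_sa, reward, next_phis, alpha, gamma):
    """One semi-gradient Q learning step on the weight vector.

    next_phis holds phi(s', a') for every action a' available in the next
    state; an empty list means the next state is terminal.
    """
    best_next = max((q_value(weights, p) for p in next_phis), default=0.0)
    td_error = reward + gamma * best_next - q_value(weights, phi_sa)
    return weights + alpha * td_error * phi_sa


def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn weights from episodes that pick next hops uniformly at random."""
    rng = np.random.default_rng(seed)
    weights = np.zeros(len(EDGES))
    for _ in range(episodes):
        node = 0
        while LINKS[node]:  # stop at the destination
            next_hop = int(rng.choice(list(LINKS[node])))
            reward = -LINKS[node][next_hop]  # assumed reward: negative cost
            next_phis = [phi(next_hop, n) for n in LINKS[next_hop]]
            weights = q_learning_update(
                weights, phi(node, next_hop), reward, next_phis, alpha, gamma
            )
            node = next_hop
    return weights


def greedy_route(weights, source=0):
    """Follow the highest-valued next hop from the source to the destination."""
    route, node = [source], source
    while LINKS[node]:
        node = max(LINKS[node], key=lambda n: q_value(weights, phi(node, n)))
        route.append(node)
    return route
```

Calling `greedy_route(train())` on this toy topology should follow the cheaper route through node 1. With one-hot features the approximation reduces to a lookup table; a richer feature vector (for example, residual energy or channel state, if those were modeled) would let the same update generalize across links.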
Pages: 54435-54448
Number of pages: 14