A Reinforcement Learning Framework for Efficient Informative Sensing

被引:6
作者
Wei, Yongyong [1 ]
Zheng, Rong [1 ]
机构
[1] McMaster Univ, Dept Comp & Software, Hamilton, ON L8S 4L8, Canada
关键词
Path planning; Robot sensing systems; Robots; Reinforcement learning; Optimization; Wireless sensor networks; Prediction algorithms; Informative path planning; mobile sensing; reinforcement learning; ORIENTEERING PROBLEM;
D O I
10.1109/TMC.2020.3040945
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Large-scale spatial data can be collected using mobile robots with sensing and navigation capabilities. Due to limited battery lifetime and scarcity of charging stations, it is important to plan informative paths so as to maximize the utility of data given a limited travel budget, which is known as the informative path planning (IPP) problem. IPP is NP-hard, and existing solutions suffer from high complexity or low optimality. In this paper, we present a novel IPP solution based on reinforcement learning (RL). The basic idea is to learn the structural characteristics of informative paths, so informative paths can be predicted. As such, when budgets change, we avoid solving the problem from scratch and thus path planning efficiency can be improved dramatically. Among the 20 path planning experiments in two areas, the proposed RL based solution achieves the best path utility in 15 experiments, compared with state-of-the-art algorithms. More importantly, the inference complexity is linear with respect to the budget (equivalently, the maximum number of steps in RL), which is lower than other solutions. Despite the NP-hardness, the path planning process can be finished within a few seconds in our experiments on two graphs of different sizes.
引用
收藏
页码:2306 / 2317
页数:12
相关论文
共 45 条
[1]   Active sensing for motion planning in uncertain environments via mutual information policies [J].
A Macdonald, Ryan ;
Smith, Stephen L. .
INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2019, 38 (2-3) :146-161
[2]   Wireless sensor networks: a survey [J].
Akyildiz, IF ;
Su, W ;
Sankarasubramaniam, Y ;
Cayirci, E .
COMPUTER NETWORKS, 2002, 38 (04) :393-422
[3]  
[Anonymous], 2018, Variance Reduction for Policy Gradient with Action-dependent Factorized Baselines
[4]  
Arora Sankalp, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA), P4997, DOI 10.1109/ICRA.2017.7989582
[5]  
Bello I., 2017, PROC 5 INT C LEARN R
[6]   Informative Path Planning for an Autonomous Underwater Vehicle [J].
Binney, Jonathan ;
Krause, Andreas ;
Sukhatme, Gaurav S. .
2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, :4791-4796
[7]   Orienteering-based informative path planning for environmental monitoring [J].
Bottarelli, Lorenzo ;
Bicego, Manuele ;
Blum, Jason ;
Farinelli, Alessandro .
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2019, 77 :46-58
[8]   A recursive greedy algorithm for walks in directed graphs [J].
Chekuri, C ;
Pál, M .
46TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE, PROCEEDINGS, 2005, :245-253
[9]  
Cook W., 2015, CONCORDE TSP SOLVER
[10]  
Dai HJ, 2017, ADV NEUR IN, V30