Deep reinforcement learning for wireless sensor scheduling in cyber-physical systems

Cited by: 100
Authors
Leong, Alex S. [1]
Ramaswamy, Arunselvan [1]
Quevedo, Daniel E. [1]
Karl, Holger [1]
Shi, Ling [2]
Affiliations
[1] Paderborn Univ, Fac Comp Sci Elect Engn & Math, Paderborn, Germany
[2] Hong Kong Univ Sci & Technol, Dept Elect & Comp Engn, Hong Kong, Peoples R China
Keywords
TRANSMISSION; ALGORITHMS;
DOI
10.1016/j.automatica.2019.108759
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In many cyber-physical systems, we encounter the problem of remote state estimation of geographically distributed and remote physical processes. This paper studies the scheduling of sensor transmissions to estimate the states of multiple remote, dynamic processes. Information from the different sensors has to be transmitted to a central gateway over a wireless network for monitoring purposes, where typically fewer wireless channels are available than there are processes to be monitored. For effective estimation at the gateway, the sensors need to be scheduled appropriately, i.e., at each time instant one needs to decide which sensors have network access and which ones do not. To address this scheduling problem, we formulate an associated Markov decision process (MDP). This MDP is then solved using a Deep Q-Network, a recent deep reinforcement learning algorithm that is at once scalable and model-free. We compare our scheduling algorithm to popular scheduling algorithms such as round-robin and reduced-waiting-time, among others. Our algorithm is shown to significantly outperform these algorithms for many example scenarios. (C) 2019 Elsevier Ltd. All rights reserved.
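The scheduling decision described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or its Deep Q-Network: it assumes scalar processes x(k+1) = a·x(k) + w(k) with noise variance q, loss-free channels, and a sensor that delivers its exact state when scheduled. The names `simulate`, `round_robin`, and `greedy` are hypothetical. It only shows what "choosing which sensors get network access at each time instant" means, with a round-robin baseline and a simple largest-error-first rule for comparison.

```python
def simulate(policy, a, q, channels, steps):
    """Return the time-averaged sum of estimation error variances.

    Assumed toy model: if process i is scheduled, its error variance
    resets to q[i]; otherwise it grows as a[i]**2 * p[i] + q[i].
    """
    n = len(a)
    p = list(q)  # current error variance of each process
    total = 0.0
    for k in range(steps):
        chosen = policy(k, p, channels)
        for i in range(n):
            if i in chosen:
                p[i] = q[i]
            else:
                p[i] = a[i] ** 2 * p[i] + q[i]
        total += sum(p)
    return total / steps


def round_robin(k, p, channels):
    """Cycle through the sensors in a fixed order, `channels` per step."""
    n = len(p)
    return {(k * channels + j) % n for j in range(channels)}


def greedy(k, p, channels):
    """Schedule the sensors whose error variance is currently largest."""
    order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
    return set(order[:channels])


if __name__ == "__main__":
    a = [1.2, 1.0, 0.8]   # assumed process dynamics
    q = [1.0, 1.0, 1.0]   # assumed noise variances
    for name, policy in [("round-robin", round_robin), ("greedy", greedy)]:
        print(name, simulate(policy, a, q, channels=1, steps=1000))
```

In this toy setting the state of the decision problem is the vector of error variances and the action is the set of scheduled sensors, which is the kind of MDP structure a learning-based scheduler would operate on.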
Pages: 8
Related papers
33 in total
[1]   Learning algorithms for Markov decision processes with average cost [J].
Abounadi, J ;
Bertsekas, D ;
Borkar, VS .
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2001, 40 (03) :681-698
[2]  
Anderson B. D., 2012, OPTIMAL FILTERING
[3]  
[Anonymous], 2013, P WORKSH DEEP LEARN
[4]  
[Anonymous], 2011, Wireless Communications
[5]  
[Anonymous], P AM CONTR C
[6]  
[Anonymous], 2005, Dynamic Programming & Optimal Control
[7]  
[Anonymous], 2015, Optimization
[8]  
Baumann D., 2018, P IEEE C DEC CONTR
[9]  
Bertsekas D., 2012, Dynamic Programming and Optimal Control, V4
[10]   Fair scheduling with tunable latency: A round-robin approach [J].
Chaskar, HM ;
Madhow, U .
IEEE-ACM TRANSACTIONS ON NETWORKING, 2003, 11 (04) :592-601