Sensor Management Method Based on Deep Reinforcement Learning in Extended Target Tracking

Cited by: 0
Authors
Zhang, Hong-Yun [1]
Chen, Hui [1]
Zhang, Wen-Xu [1]
Affiliation
[1] School of Electrical Engineering and Information Engineering, Lanzhou University of Technology, Lanzhou
Source
Zidonghua Xuebao/Acta Automatica Sinica | 2024 / Vol. 50 / No. 07
Funding
National Natural Science Foundation of China
Keywords
deep reinforcement learning (DRL); extended target tracking (ETT); information gain; sensor management; twin delayed deep deterministic policy gradient (TD3)
DOI
10.16383/j.aas.c230591
Abstract
To address sensor management in the optimization of extended target tracking (ETT), this paper proposes a sensor management method based on deep reinforcement learning (DRL), with the extended target modeled by the random matrix model (RMM). First, within the theoretical framework of the partially observable Markov decision process (POMDP), an elementary sensor management method for extended target tracking based on the twin delayed deep deterministic policy gradient (TD3) algorithm is presented. Next, the Gaussian Wasserstein distance (GWD) is used to calculate the information gain between the prior and posterior probability densities of the extended target; this gain comprehensively evaluates the multi-feature estimation information of the extended target and serves as the reward function of the TD3 algorithm. The optimal sensor management scheme based on deep reinforcement learning is then determined from the derived reward function. Finally, the effectiveness of the proposed algorithm is verified by constructing an extended target tracking optimization simulation experiment. © 2024 Science Press. All rights reserved.
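As an illustration of the reward idea in the abstract, the following is a minimal sketch of the Gaussian Wasserstein distance between two Gaussian densities, used here as a stand-in for the information gain between a prior and a posterior. It assumes each density is summarized by a mean vector and a symmetric positive semidefinite matrix; the function names (`gaussian_wasserstein_distance`, `information_gain_reward`) and the way the reward is formed are hypothetical and are not taken from the paper, whose exact formulation for the random matrix model may differ.

```python
import numpy as np


def _psd_sqrt(mat):
    """Square root of a symmetric positive semidefinite matrix via eigendecomposition."""
    sym = 0.5 * (mat + mat.T)           # symmetrize to suppress round-off asymmetry
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, 0.0, None)     # clamp tiny negative eigenvalues to zero
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_wasserstein_distance(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1) and N(m2, s2).

    d^2 = ||m1 - m2||^2 + tr(s1 + s2 - 2 * (s1^(1/2) s2 s1^(1/2))^(1/2))
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    root1 = _psd_sqrt(s1)
    cross = _psd_sqrt(root1 @ s2 @ root1)
    squared = np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross)
    return float(np.sqrt(max(squared, 0.0)))  # guard against negative round-off


def information_gain_reward(prior, posterior):
    """Hypothetical reward: GWD between (mean, matrix) pairs of prior and posterior."""
    return gaussian_wasserstein_distance(prior[0], prior[1], posterior[0], posterior[1])


if __name__ == "__main__":
    prior = (np.array([0.0, 0.0]), np.diag([4.0, 4.0]))
    posterior = (np.array([0.5, -0.5]), np.diag([1.0, 1.0]))
    print(information_gain_reward(prior, posterior))
```

In a TD3 training loop, a value like this could be returned by the environment as the per-step reward after the tracking filter update for the chosen sensor action; how the paper combines kinematic and extent information in the reward is not specified in the abstract.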
Pages: 1417-1431
Number of pages: 14