Space-Air-Ground Integrated Mobile Crowdsensing for Partially Observable Data Collection by Multi-Scale Convolutional Graph Reinforcement Learning

Citations: 4
Authors
Ren, Yixiang [1 ]
Ye, Zhenhui [2 ]
Song, Guanghua [1 ]
Jiang, Xiaohong [2 ]
Affiliations
[1] Zhejiang Univ, Sch Aeronaut & Astronaut, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
mobile crowdsensing; deep reinforcement learning; UAV control; graph network; maximum-entropy learning; COOPERATIVE SEARCH; COVERAGE;
DOI
10.3390/e24050638
Chinese Library Classification (CLC)
O4 [Physics];
Discipline Code
0702;
Abstract
Mobile crowdsensing (MCS) has attracted considerable attention in recent years as a new paradigm for large-scale information sensing. Unmanned aerial vehicles (UAVs) play a significant role in MCS tasks and serve as crucial nodes in the newly proposed space-air-ground integrated network (SAGIN). In this paper, we incorporate SAGIN into the MCS task and present a Space-Air-Ground integrated Mobile CrowdSensing (SAG-MCS) problem. Based on multi-source observations from embedded sensors and satellites, an aerial UAV swarm is required to carry out energy-efficient data collection and recharging tasks. To date, few studies have explored such a multi-task MCS problem with the cooperation of a UAV swarm and satellites. To address this multi-agent problem, we propose a novel deep reinforcement learning (DRL) based method called Multi-Scale Soft Deep Recurrent Graph Network (ms-SDRGN). Our ms-SDRGN approach incorporates a multi-scale convolutional encoder to process multi-source raw observations for better feature exploitation. We also use a graph attention mechanism to model inter-UAV communications and aggregate extra neighboring information, and utilize a gated recurrent unit for long-term performance. In addition, a stochastic policy can be learned through a maximum-entropy method with an adjustable temperature parameter. Specifically, we design a heuristic reward function to encourage the agents to achieve global cooperation under partial observability. We train the model to convergence and conduct a series of case studies. Evaluation results show statistical significance and that ms-SDRGN outperforms three state-of-the-art DRL baselines in SAG-MCS. Compared with the best-performing baseline, ms-SDRGN improves reward by 29.0% and CFE score by 3.8%. We also investigate the scalability and robustness of ms-SDRGN in DRL environments with diverse observation scales and demanding communication conditions.
Pages: 20