Space-Air-Ground Integrated Mobile Crowdsensing for Partially Observable Data Collection by Multi-Scale Convolutional Graph Reinforcement Learning

Citations: 4
Authors
Ren, Yixiang [1 ]
Ye, Zhenhui [2 ]
Song, Guanghua [1 ]
Jiang, Xiaohong [2 ]
Affiliations
[1] Zhejiang Univ, Sch Aeronaut & Astronaut, Hangzhou 310027, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310027, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
mobile crowdsensing; deep reinforcement learning; UAV control; graph network; maximum-entropy learning; COOPERATIVE SEARCH; COVERAGE;
DOI
10.3390/e24050638
Chinese Library Classification (CLC)
O4 [Physics];
Discipline Code
0702;
Abstract
Mobile crowdsensing (MCS) has attracted considerable attention in recent years as a new paradigm for large-scale information sensing. Unmanned aerial vehicles (UAVs) play a significant role in MCS tasks and serve as crucial nodes in the newly proposed space-air-ground integrated network (SAGIN). In this paper, we incorporate SAGIN into the MCS task and present a Space-Air-Ground integrated Mobile CrowdSensing (SAG-MCS) problem. Based on multi-source observations from embedded sensors and satellites, an aerial UAV swarm is required to carry out energy-efficient data collection and recharging tasks. To date, few studies have explored such a multi-task MCS problem with the cooperation of a UAV swarm and satellites. To address this multi-agent problem, we propose a novel deep reinforcement learning (DRL) based method called Multi-Scale Soft Deep Recurrent Graph Network (ms-SDRGN). Our ms-SDRGN approach incorporates a multi-scale convolutional encoder to process multi-source raw observations for better feature exploitation. We also use a graph attention mechanism to model inter-UAV communications and aggregate extra neighboring information, and utilize a gated recurrent unit for long-term performance. In addition, a stochastic policy can be learned through a maximum-entropy method with an adjustable temperature parameter. Specifically, we design a heuristic reward function to encourage the agents to achieve global cooperation under partial observability. We train the model to convergence and conduct a series of case studies. Evaluation results show statistical significance and that ms-SDRGN outperforms three state-of-the-art DRL baselines in SAG-MCS. Compared with the best-performing baseline, ms-SDRGN improves reward by 29.0% and CFE score by 3.8%. We also investigate the scalability and robustness of ms-SDRGN in DRL environments with diverse observation scales and demanding communication conditions.
Pages: 20