PTMB: An online satellite task scheduling framework based on pre-trained Markov decision process for multi-task scenario

Cited by: 9
Authors
Li, Guohao [1]
Li, Xuefei [1]
Li, Jing [1]
Chen, Jia [1]
Shen, Xin [2]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Bayi Rd, Wuhan 430072, Hubei, Peoples R China
[2] Wuhan Univ, State Key Lab Informat Engn Surveying, Mapping & Remote Sensing, Bayi Rd, Wuhan 430072, Hubei, Peoples R China
Keywords
Reinforcement learning; Satellite task scheduling; Pre-trained model; Markov decision process; ORDER ACCEPTANCE; SEARCH ALGORITHM;
DOI
10.1016/j.knosys.2023.111339
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The ever-growing scale of satellites and the increasing demand for Earth observation have drawn significant interest to the problem of online multi-task scheduling for Earth observation satellites. At present, task scheduling relies mostly on meta-heuristic and reinforcement learning algorithms. However, meta-heuristic algorithms converge slowly and are ill-suited to multi-task scheduling scenarios, while reinforcement learning algorithms cannot guarantee solution quality because of unstable environmental states. In this study, we propose an online micro-batch scheduling framework based on a pre-trained reinforcement learning model (PTMB). The framework splits the satellite scheduling problem into two phases: task decision-making and task allocation. We adopt a micro-batch processing mode and introduce a pre-trained Markov decision model in the task decision-making phase. We also incorporate resource pre-allocation, task sequencing, task order shuffling, and other strategies to improve overall solution quality. Simulation experiments show a significant performance improvement for the proposed method. Specifically, for task sizes of 100 and 300, the task scheduling reward of the proposed framework exceeds that of the improved genetic algorithm by 11.5% and 0.4%, respectively, while time consumption is reduced by 97.8% and 99.3%. Furthermore, our framework outperforms the online reinforcement learning scheduling method, which is also based on a Markov decision process model, improving task scheduling reward by 55.1%, 5.6%, and 15.6% and reducing time consumption by 94.9%, 96.7%, and 99.1% at task sizes of 100, 300, and 500, respectively.
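The two-phase, micro-batch idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so every name here (`Task`, `micro_batches`, `accept`, `allocate`, `schedule`) is hypothetical, the pre-trained Markov decision model is replaced by a trivial feasibility rule, satellite resources are reduced to a single capacity number, and "task sequencing" and "task order shuffle" are approximated by trying a reward-density ordering plus a few random orderings per batch.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    task_id: int
    reward: float
    duration: int  # capacity units the task consumes (a simplifying assumption)


def micro_batches(stream, batch_size):
    """Group an incoming stream into micro-batches of at most batch_size items."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def accept(task, free_capacity):
    """Decision phase. Stand-in for the pre-trained model: accept if the task fits."""
    return any(cap >= task.duration for cap in free_capacity)


def allocate(task, free_capacity):
    """Allocation phase: best-fit, i.e. the tightest satellite that still fits."""
    fitting = [i for i, cap in enumerate(free_capacity) if cap >= task.duration]
    if not fitting:
        return None
    sat = min(fitting, key=lambda i: free_capacity[i])
    free_capacity[sat] -= task.duration
    return sat


def schedule(stream, capacities, batch_size=4, shuffles=3, seed=0):
    """Schedule each micro-batch under several task orders and keep the best one."""
    rng = random.Random(seed)
    free = list(capacities)
    plan = {}
    for batch in micro_batches(stream, batch_size):
        # Candidate orders: reward per unit duration first, then random shuffles.
        orders = [sorted(batch, key=lambda t: t.reward / t.duration, reverse=True)]
        for _ in range(shuffles):
            order = list(batch)
            rng.shuffle(order)
            orders.append(order)

        best_plan, best_free, best_reward = {}, free, -1.0
        for order in orders:
            trial_free, trial_plan, reward = list(free), {}, 0.0
            for task in order:
                if accept(task, trial_free):
                    sat = allocate(task, trial_free)
                    if sat is not None:
                        trial_plan[task.task_id] = sat
                        reward += task.reward
            if reward > best_reward:
                best_plan, best_free, best_reward = trial_plan, trial_free, reward

        plan.update(best_plan)
        free = best_free
    return plan, free


tasks = [Task(0, 5.0, 3), Task(1, 1.0, 3), Task(2, 4.0, 2)]
plan, free = schedule(tasks, capacities=[3, 2], batch_size=3)
print(plan, free)
```

In this toy run, tasks 0 and 2 are placed on satellites 0 and 1 and task 1 is rejected because no capacity remains. In a real system the `accept` rule is where a trained policy would be queried, and the per-batch loop bounds how much work is done before a decision is returned.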
Pages: 13