Learning to construct a solution for UAV path planning problem with positioning error correction

Cited by: 5
Authors
Chun, Jie [1 ]
Chen, Ming [1 ]
Liu, Xiaolu [1 ]
Xiang, Shang [2 ]
Du, Yonghao [1 ]
Wu, Guohua [3 ]
Xing, Lining [4 ]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha 410073, Hunan, Peoples R China
[2] XiangTan Univ, Sch Publ Adm, Xiangtan 411100, Hunan, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410075, Hunan, Peoples R China
[4] Xidian Univ, Coll Elect Engn, Xian 710126, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep reinforcement learning; UAV; Path planning; Positioning error correction; Reinforcement learning algorithm; PARTICLE SWARM OPTIMIZATION; ALGORITHM; VEHICLE;
DOI
10.1016/j.knosys.2024.112569
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Unmanned aerial vehicles (UAVs) are advanced flight systems; however, their positioning systems introduce distance-dependent errors during flight. This study solves the UAV path planning problem with positioning error correction (UPEC) using an end-to-end method. Traditional methods struggle to balance solution quality against computational overhead and often make limited use of scenario information. To overcome these issues, we propose a path planning model (PPM) based on deep reinforcement learning to solve the UPEC. The model has a complete structure comprising a mathematical model, feature engineering, a solution process, a neural policy network, scenario generation, a training process, and a test solution mechanism. Specifically, we first establish a Markov decision process (MDP) for UPEC and apply feature engineering with effective features to support decision-making. We then introduce a path planning neural network (PPNN) to represent the MDP policy. Based on the dataset generated via multi-rule combination validation, we train the PPNN using the proposed RL algorithm with a storage pool. Furthermore, we propose a backtracking mechanism to guarantee solution feasibility during the construction process. Extensive experiments demonstrate that the proposed PPM outperforms existing state-of-the-art algorithms in terms of solution quality and timeliness, and that the backtracking mechanism effectively improves the scenario completion rate. The model study indicates the efficacy of our training algorithm and the generalisation of the PPNN. Additionally, our construction process is problem-tailored and better suited to UPEC than iterative search algorithms, because it effectively mitigates the impact of invalid nodes.
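The constructive solution process with backtracking described in the abstract can be illustrated with a minimal toy sketch. Everything here is an assumption for illustration only: the node layout, the linear error-accumulation model, the error limit, and the greedy distance-to-goal scoring rule (standing in for the paper's learned PPNN policy) are hypothetical, not the authors' implementation.

```python
import math

# Toy instance: nodes are (x, y) waypoints; correction stations reset
# the accumulated positioning error to zero when visited.
ERROR_RATE = 1.0    # error units per distance unit flown (assumption)
ERROR_LIMIT = 60.0  # maximum tolerable accumulated error (assumption)

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def construct_path(start, goal, nodes, correctors):
    """Greedily extend the path with the feasible unvisited node nearest
    the goal; on a dead end, backtrack one step and ban the failed move
    for that prefix, mimicking a feasibility-guaranteeing backtracking
    mechanism during construction."""
    path, errors = [start], [0.0]
    banned = {}  # maps a path prefix to next-nodes already proven bad
    while path[-1] != goal:
        cur, error, key = path[-1], errors[-1], tuple(path)
        candidates = [
            n for n in nodes + [goal]
            if n not in path
            and n not in banned.get(key, set())
            and error + ERROR_RATE * dist(cur, n) <= ERROR_LIMIT
        ]
        if not candidates:
            if len(path) == 1:
                return None  # no feasible path under this policy
            bad = path.pop()          # backtrack one step ...
            errors.pop()
            banned.setdefault(tuple(path), set()).add(bad)  # ... and ban it
            continue
        nxt = min(candidates, key=lambda n: dist(n, goal))
        new_error = error + ERROR_RATE * dist(cur, nxt)
        if nxt in correctors:
            new_error = 0.0  # correction station resets positioning error
        path.append(nxt)
        errors.append(new_error)
    return path
```

For example, with a goal 100 units away and an error limit of 60, the direct flight is infeasible, so the construction routes through a correction station at the midpoint: `construct_path((0, 0), (100, 0), [(50, 0)], {(50, 0)})` returns `[(0, 0), (50, 0), (100, 0)]`, while the same instance without the station returns `None`.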
Pages: 12