A Navigation Algorithm Based on the Reinforcement Learning Reward System and Optimised with Genetic Algorithm

Cited by: 0

Authors:

Cabezas-Olivenza, Mireya [1]

Zulueta, Ekaitz [2]

Azurmendi-Marquinez, Iker [3]

Fernandez-Gamiz, Unai [4]

Rico-Melgosa, Danel [2]

Affiliations:

[1] Mondragon Unibertsitatea, Fac Engn, Arrasate Mondragon 20500, Spain

[2] Univ Basque Country UPV EHU, Syst Engn & Automat Control Dept, Nieves Cano 12, Vitoria 01006, Spain

[3] CS Ctr Stirling S Coop, Avda Alava 3, Aretxabaleta 20550, Spain

[4] Univ Basque Country UPV EHU, Dept Energy Engn, Nieves Cano 12, Vitoria 01006, Spain

Source:

MATHEMATICS | 2024 / Volume 12 / Issue 24

Keywords:

navigation; reinforcement learning; genetic algorithm; optimisation; autonomous vehicle; q-learning; AGV; MOBILE ROBOT NAVIGATION;

DOI:

10.3390/math12244030

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

Regarding autonomous vehicle navigation, reinforcement learning is a technique that has demonstrated significant results. Nevertheless, it is a technique with a high number of parameters that need to be optimised without prior information, and correctly performing this is a complicated task. In this research study, a system based on the principles of reinforcement learning, specifically on the concept of rewards, is presented. A mathematical expression was proposed to control the vehicle's direction based on its position, the obstacles in the environment and the destination. In this equation proposal, there was only one unknown parameter that regulated the degree of the action to be taken, and this was optimised through the genetic algorithm. In this way, a less computationally expensive navigation algorithm was presented, as it avoided the use of neural networks. The controller's time to obtain the navigation instructions was around 6.201·10⁻⁴ s. This algorithm is an efficient and accurate system which manages not to collide with obstacles and to reach the destination from any position. Moreover, in most cases, it has been found that the proposed navigations are also optimal.


Pages: 25
