Aircraft collision avoidance modeling and optimization using deep reinforcement learning

Cited by: 2

Authors:

Park K.-W. [1]

Kim J.-H. [2]

Affiliations:

[1] AI Lab, Deltax Co., Ltd

Source:

Journal of Institute of Control, Robotics and Systems | 2021, Vol. 27, No. 9

Keywords:

Collision avoidance; Imitation learning; Machine learning; Optimization; Reinforcement learning;

DOI:

10.5302/J.ICROS.2021.21.0034

Abstract:

We propose an imitation-type reinforcement learning approach for aircraft collision avoidance problems. The policy model is first trained under supervision to learn collision avoidance strategies based on domain knowledge from flight mechanics and guidance, and is then updated and optimized via reinforcement learning with proximal policy optimization (PPO). The performance of the proposed approach was verified through Monte Carlo simulation runs covering a wide range of collision geometries. © ICROS 2021.
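
The abstract gives no implementation details, so the following is only a minimal sketch of the two training signals it names: a supervised imitation loss for the initial stage, and the standard clipped surrogate objective from the PPO paper (Schulman et al., reference [8]) for the fine-tuning stage. The function names, the default clipping value of 0.2, and the use of plain Python lists are assumptions made for illustration and are not taken from the paper.

```python
import math


def behavior_cloning_loss(expert_action_logps):
    """Supervised stage (assumed form): mean negative log-likelihood that the
    policy assigns to the expert's actions. Lower is better."""
    return -sum(expert_action_logps) / len(expert_action_logps)


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Fine-tuning stage: mean PPO clipped surrogate objective (to be maximized).

    new_logps / old_logps are log-probabilities of the taken actions under the
    current policy and the data-collecting policy; advantages are the
    corresponding advantage estimates.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the clipped/unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the probability ratio exceeds 1 + clip_eps, which limits how far a single update can move the policy away from its imitation-trained starting point.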

Pages: 652-659

Page count: 7

References (12 in total):

[1]

Hwang Y.K., Ahuja N., A potential field approach to path planning, IEEE Transactions on Robotics and Automation, 8, 1, (1992)

[2]

Han S.C., Bang H.C., Proportional navigation-based optimal collision avoidance for UAVs, Journal of Institute of Control, Robotics and Systems (In Korean), 10, 11, pp. 1065-1070, (2004)

[3]

Chen Y.F., Liu M., Everett M., How J.P., Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning

[4]

Park S.G., Kim D.H., Autonomous flying of drone based on PPO reinforcement learning algorithm, Journal of Institute of Control, Robotics and Systems (In Korean), 26, 11, pp. 955-963

[5]

Kim M., Kim J., Jung M., Oh H., Collision avoidance for a small drone with monocular camera using deep reinforcement learning in an indoor environment, Journal of Institute of Control, Robotics and Systems (In Korean), 26, 6, pp. 399-411

[6]

Tesauro G., Practical issues in temporal difference learning, Machine Learning, 8, pp. 257-277, (1992)

[7]

Mnih V., Badia A.P., Mirza M., Graves A., Lillicrap T.P., Harley T., Silver D., Kavukcuoglu K., Asynchronous methods for deep reinforcement learning, ICML, (2016)

[8]

Schulman J., Wolski F., Dhariwal P., Radford A., Klimov O., Proximal policy optimization algorithms

[9]

Oh J., Guo Y., Singh S., Lee H., Self-imitation learning

[10]

Kostrikov I., Nachum O., Tompson J., Imitation learning via off-policy distribution matching, ICLR, (2020)