A Vision-Based End-to-End Reinforcement Learning Framework for Drone Target Tracking

Cited by: 0

Authors:

Zhao, Xun ^{[1]}

Huang, Xinjian ^{[2]}

Cheng, Jianheng ^{[3]}

Xia, Zhendong ^{[3]}

Tu, Zhiheng ^{[1]}

Affiliations:

[1] Nanjing Univ Sci & Technol, Sch Mech Engn, Nanjing 210094, Peoples R China

[2] Nanjing Univ Sci & Technol, Sch Cyber Sci & Engn, Nanjing 210094, Peoples R China

[3] Nanjing Univ Sci & Technol, Sch Sino French Engineers, Nanjing 210094, Peoples R China

Source:

DRONES | 2024 / Vol. 8 / Issue 11

Keywords:

drone target tracking; end to end; reinforcement learning; YOLOv8; detector; BoT-SORT; twin delayed deep deterministic policy gradient

DOI:

10.3390/drones8110628

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology]

Discipline classification codes:

081102; 0816; 081602; 083002; 1404

Abstract:

Drone target tracking, which involves instructing drone movement to follow a moving target, encounters several challenges: (1) traditional methods need accurate state estimation of both the drone and target; (2) conventional Proportional-Derivative (PD) controllers require tedious parameter tuning and struggle with nonlinear properties; and (3) reinforcement learning methods, though promising, rely on the drone's self-state estimation, adding complexity and computational load and reducing reliability. To address these challenges, this study proposes an innovative model-free end-to-end reinforcement learning framework, the VTD3 (Vision-Based Twin Delayed Deep Deterministic Policy Gradient), for drone target tracking tasks. This framework focuses on controlling the drone to follow a moving target while maintaining a specific distance. VTD3 is a pure vision-based tracking algorithm which integrates the YOLOv8 detector, the BoT-SORT tracking algorithm, and the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. It diminishes reliance on GPS and other sensors while simultaneously enhancing the tracking capability for complex target motion trajectories. In a simulated environment, we assess the tracking performance of VTD3 across four complex target motion trajectories (triangular, square, sawtooth, and square wave, including scenarios with occlusions). The experimental results indicate that our proposed VTD3 reinforcement learning algorithm substantially outperforms conventional PD controllers in drone target tracking applications. Across various target trajectories, the VTD3 algorithm demonstrates a significant reduction in average tracking errors along the X-axis and Y-axis of up to 34.35% and 45.36%, respectively. Additionally, it achieves a notable improvement of up to 66.10% in altitude control precision. 
In terms of motion smoothness, the VTD3 algorithm markedly enhances performance metrics, with improvements of up to 37.70% in jitter and 60.64% in Jerk RMS. Empirical results verify the superiority and feasibility of our proposed VTD3 framework for drone target tracking.
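The smoothness figures above can be made concrete with a small sketch. It assumes Jerk RMS means the root mean square of the third time derivative of position, estimated here by finite differences on a uniformly sampled 1-D trajectory, and that each improvement is a relative reduction against the PD baseline. The abstract does not give the paper's exact definitions, so the function names `jerk_rms` and `percent_reduction`, the sampling interval `dt`, and the sample values are all illustrative assumptions.

```python
import math


def jerk_rms(positions, dt):
    """RMS of jerk for a uniformly sampled 1-D position series.

    Assumption: jerk is approximated by the third forward difference
    divided by dt**3. This is a hypothetical helper, not the paper's code.
    """
    if len(positions) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    jerks = [
        (positions[i + 3] - 3 * positions[i + 2] + 3 * positions[i + 1] - positions[i])
        / dt ** 3
        for i in range(len(positions) - 3)
    ]
    return math.sqrt(sum(j * j for j in jerks) / len(jerks))


def percent_reduction(baseline, candidate):
    """Relative reduction of a metric versus a baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Illustrative data only: x(t) = t**3 sampled at dt = 1 has constant jerk 6.
print(jerk_rms([0, 1, 8, 27, 64], dt=1.0))  # 6.0
print(percent_reduction(10.0, 6.0))         # 40.0
```

Under these assumptions, a reported improvement such as 60.64% in Jerk RMS would correspond to `percent_reduction(pd_value, vtd3_value)` evaluated on the two controllers' flight logs for the same target trajectory.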


Pages: 25
