A Lightweight Action Recognition Method for Unmanned-Aerial-Vehicle Video

Cited by: 9

Authors:

Ding, Meng [1]

Li, Ning [1]

Song, Ziang [1]

Zhang, Ruixing [1]

Zhang, Xiaxia [1]

Zhou, Huiyu [2]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing, Peoples R China

[2] Univ Leicester, Sch Informat, Leicester LE1 7RH, Leics, England

Source:

2020 IEEE 3rd International Conference on Electronics and Communication Engineering (ICECE) | 2020

Keywords:

action recognition; UAV; MobileNetV3; self-attention

DOI:

10.1109/ICECE51594.2020.9353008

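Chinese Library Classification (CLC) codes:

TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract: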

In recent years, owing to their mobility and wide coverage, unmanned aerial vehicles (UAVs) have been widely applied in surveillance systems. Human action recognition in UAV video is essential for surveillance video understanding. However, existing action recognition methods suffer from heavy computation, which makes them hard to deploy in real applications. In this paper, a lightweight action recognition method for UAV video (LARMUV) is proposed. The method is based on TSN and adopts MobileNetV3 as its backbone, which greatly reduces the amount of computation and the number of parameters. A self-attention mechanism is adopted to capture the temporal structure among different frames. For the loss function, Focal Loss is used to put more focus on hard, misclassified examples. Finally, knowledge distillation is employed to enhance the performance of our model by transferring knowledge from a larger teacher model to the student model. Experimental results on HMDB51, UCF101 and a UAV dataset show that our method achieves competitive performance compared with baseline methods while running in real time.
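The abstract names Focal Loss and knowledge distillation as the training objectives but gives no formulas. The following is a minimal sketch of how the two terms are commonly written, not the authors' implementation. The function names, the mixing weight `alpha`, the focusing parameter `gamma=2.0`, the `temperature=4.0` default, and the way the two terms are combined are all assumptions for illustration; the class-balancing weight of the original Focal Loss formulation is omitted for brevity.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; temperature > 1 softens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Multi-class focal loss: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy. Larger gamma
    down-weights examples the model already classifies confidently.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def total_loss(student_logits, teacher_logits, target,
               alpha=0.5, gamma=2.0, temperature=4.0):
    """Hypothetical combination: weighted sum of the hard-label and soft-label terms."""
    hard = focal_loss(student_logits, target, gamma)
    soft = distillation_loss(student_logits, teacher_logits, temperature)
    return alpha * hard + (1.0 - alpha) * soft


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]   # made-up logits for a 3-class toy example
    teacher = [3.0, 0.2, -2.0]
    print(total_loss(student, teacher, target=0))
```

In this sketch, a confidently correct prediction contributes almost nothing to the focal term, while a misclassified one keeps nearly its full cross-entropy value, which is the "focus on hard examples" behaviour the abstract describes.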

Pages: 181-185

Page count: 5

