A Lightweight Action Recognition Method for Unmanned-Aerial-Vehicle Video

Cited by: 9
Authors
Ding, Meng [1]
Li, Ning [1]
Song, Ziang [1]
Zhang, Ruixing [1]
Zhang, Xiaxia [1]
Zhou, Huiyu [2]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Elect & Informat Engn, Nanjing, Peoples R China
[2] Univ Leicester, Sch Informat, Leicester LE1 7RH, Leics, England
Source
2020 IEEE THE 3RD INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION ENGINEERING (ICECE) | 2020
Keywords
action recognition; UAV; MobileNetV3; self-attention
DOI
10.1109/ICECE51594.2020.9353008
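Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract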
In recent years, owing to their mobility and wide coverage, unmanned aerial vehicles (UAVs) have been widely applied in surveillance systems. Human action recognition in UAV video is essential for understanding surveillance video. However, existing action recognition methods are computationally heavy, which makes them hard to deploy in real applications. In this paper, a lightweight action recognition method for UAV video (LARMUV) is proposed. The method is based on TSN and adopts MobileNetV3 as its backbone, which greatly reduces the amount of computation and the number of parameters. A self-attention mechanism is adopted to capture the temporal structure among different frames. For the loss function, Focal Loss is used to put more focus on hard, misclassified examples. Finally, knowledge distillation is employed to enhance the performance of the model by transferring knowledge from a larger teacher model to the student model. Experimental results on HMDB51, UCF101, and a UAV dataset show that the method achieves competitive performance compared with baseline methods while running in real time.
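The two training signals named in the abstract, Focal Loss and knowledge distillation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss formulation or any hyperparameters, so the multi-class focal form without class weighting, the temperature-softened KL term, the values `gamma=2.0`, `temperature=4.0`, and `alpha=0.5`, and the helper `total_loss` are all assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Multi-class focal loss: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an assumed default; gamma=0 reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    temperature=4.0 is an assumed value, not taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0  # terms with zero teacher probability contribute nothing
    )
    return (temperature ** 2) * kl


def total_loss(student_logits, teacher_logits, target,
               alpha=0.5, gamma=2.0, temperature=4.0):
    """Hypothetical weighted sum of the hard-label and soft-label terms."""
    hard = focal_loss(student_logits, target, gamma)
    soft = distillation_loss(student_logits, teacher_logits, temperature)
    return (1.0 - alpha) * hard + alpha * soft


if __name__ == "__main__":
    student = [1.0, 0.5, -0.5]   # made-up logits for a 3-class toy example
    teacher = [2.0, 0.2, -1.0]
    print(total_loss(student, teacher, target=0))
```

With `gamma > 0`, a confidently correct prediction is down-weighted relative to plain cross-entropy, so gradient updates are dominated by hard or misclassified clips; the distillation term vanishes when the student reproduces the teacher's softened distribution.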
Pages: 181-185
Number of pages: 5
Related papers
16 in total
  • [1] [Anonymous], 2017, ARXIV170404861
  • [2] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
    Carreira, Joao
    Zisserman, Andrew
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 4724 - 4733
  • [3] Cheng Y, 2017, ARXIV171009282
  • [4] Learning Spatiotemporal Features with 3D Convolutional Networks
    Du Tran
    Bourdev, Lubomir
    Fergus, Rob
    Torresani, Lorenzo
    Paluri, Manohar
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4489 - 4497
  • [5] SlowFast Networks for Video Recognition
    Feichtenhofer, Christoph
    Fan, Haoqi
    Malik, Jitendra
    He, Kaiming
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6201 - 6210
  • [6] Heng Wang, 2011, 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), P3169, DOI 10.1109/CVPR.2011.5995407
  • [7] Hinton G., 2015, ARXIV
  • [8] Searching for MobileNetV3
    Howard, Andrew
    Sandler, Mark
    Chu, Grace
    Chen, Liang-Chieh
    Chen, Bo
    Tan, Mingxing
    Wang, Weijun
    Zhu, Yukun
    Pang, Ruoming
    Vasudevan, Vijay
    Le, Quoc V.
    Adam, Hartwig
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1314 - 1324
  • [9] Focal Loss for Dense Object Detection
    Lin, Tsung-Yi
    Goyal, Priya
    Girshick, Ross
    He, Kaiming
    Dollar, Piotr
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 2999 - 3007
  • [10] Ng JYH, 2015, PROC CVPR IEEE, P4694, DOI 10.1109/CVPR.2015.7299101