Deep Attention Neural Network for Multi-Label Classification in Unmanned Aerial Vehicle Imagery

Cited by: 28
Authors
Alshehri, Aaliyah [1]
Bazi, Yakoub [1]
Ammour, Nassim [1]
Almubarak, Haidar [1]
Alajlan, Naif [1]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Comp Engn Dept, Riyadh 11543, Saudi Arabia
Keywords
UAV imagery; deep learning; attention neural network; multi-label image classification; LEARNING APPROACH
DOI
10.1109/ACCESS.2019.2936616
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-label classification of Unmanned Aerial Vehicle (UAV) images is more challenging than single-label classification because of its combinatorial nature. To tackle this problem, we propose a deep learning approach based on an encoder-decoder neural network architecture with channel and spatial attention mechanisms. The encoder module, which is based on a pre-trained convolutional neural network (CNN), transforms the input image into a set of feature maps using a suitable feature combination. To further improve the feature representation, this module incorporates a squeeze-and-excitation (SE) layer that models the interdependencies between the channels of the feature maps. The decoder module, which is based on a long short-term memory (LSTM) network, sequentially generates the classes present in the image. At each time step, it predicts the next class label by aligning its hidden state with the corresponding region of the image through an adaptive spatial attention mechanism. Experiments on two UAV datasets with a spatial resolution of 2 cm show that the method is promising in predicting the labels present in an image while attending to the relevant objects. It also provides better classification results than state-of-the-art methods.
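The two attention steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random rather than learned, the feature maps are toy-sized, and the spatial step is plain soft attention rather than the adaptive variant used in the paper. All function names, matrix names and sizes below are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def squeeze_excite(fmaps, W1, W2):
    """Channel attention. fmaps holds C channels, each a flat list of N values."""
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in fmaps]
    # Excitation: bottleneck layer + ReLU, then expansion layer + sigmoid.
    hidden = [max(0.0, h) for h in matvec(W1, z)]
    gates = [sigmoid(g) for g in matvec(W2, hidden)]
    # Rescale every channel by its gate in (0, 1).
    return [[g * x for x in ch] for g, ch in zip(gates, fmaps)], gates


def spatial_attention(fmaps, h, Wf, Wh, w):
    """Soft attention over the N locations, conditioned on hidden state h."""
    C, N = len(fmaps), len(fmaps[0])
    h_proj = matvec(Wh, h)
    scores = []
    for i in range(N):
        v_i = [fmaps[c][i] for c in range(C)]  # feature vector at location i
        f_proj = matvec(Wf, v_i)
        scores.append(sum(wk * math.tanh(a + b)
                          for wk, a, b in zip(w, f_proj, h_proj)))
    # Numerically stable softmax over locations.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Context vector: attention-weighted sum of the location features.
    context = [sum(alpha[i] * fmaps[c][i] for i in range(N)) for c in range(C)]
    return alpha, context


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy sizes (assumptions): channels, locations, SE bottleneck, hidden, attention.
C, N, R, H, A = 4, 9, 2, 3, 5
fmaps = rand_matrix(C, N)
recalibrated, gates = squeeze_excite(fmaps, rand_matrix(R, C), rand_matrix(C, R))
alpha, context = spatial_attention(
    recalibrated, [0.1] * H, rand_matrix(A, C), rand_matrix(A, H),
    [rng.uniform(-0.5, 0.5) for _ in range(A)])
```

In a decoder of the kind described, the context vector would be fed to the LSTM together with the previous label at each time step, and the weights in `alpha` would indicate which image region the current prediction attends to.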
Pages: 119873-119880
Number of pages: 8