A Multiple-Instance Densely-Connected ConvNet for Aerial Scene Classification

Cited by: 113
Authors
Bi, Qi [1]
Qin, Kun [1]
Li, Zhili [2]
Zhang, Han [1]
Xu, Kai [2]
Xia, Gui-Song [3,4]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] China Univ Geosci, Sch Geog & Informat Engn, Wuhan 430074, Peoples R China
[3] Wuhan Univ, Sch Comp Sci, Wuhan 430079, Peoples R China
[4] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Semantics; Machine learning; Task analysis; Training; Neural networks; Visualization; Scene classification; convolutional neural network; multiple instance learning; dense connection; aerial image; CONVOLUTIONAL NEURAL-NETWORKS; IMAGE; SCALE; EFFICIENT; FEATURES; FUSION; MODEL; COLOR;
DOI
10.1109/TIP.2020.2975718
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unlike natural scenes, aerial scenes viewed from above often contain many densely distributed objects, so describing them usually demands more discriminative features as well as local semantics. However, when applied to scene classification, most existing convolutional neural networks (ConvNets) tend to depict the global semantics of images, and the loss of low- and mid-level features can hardly be avoided, especially as the model goes deeper. To tackle these challenges, we propose a multiple-instance densely-connected ConvNet (MIDC-Net) for aerial scene classification. It treats aerial scene classification as a multiple-instance learning problem so that local semantics can be further investigated. Our classification model consists of an instance-level classifier, a multiple-instance pooling layer, and a bag-level classification layer. In the instance-level classifier, we propose a simplified dense connection structure to effectively preserve features from different levels. The extracted convolutional features are further converted into instance feature vectors. Then, we propose a trainable attention-based multiple-instance pooling. It highlights the local semantics relevant to the scene label and outputs the bag-level probability directly. Finally, with our bag-level classification layer, this multiple-instance learning framework is under the direct supervision of bag labels. Experiments on three widely used aerial scene benchmarks demonstrate that our proposed method outperforms many state-of-the-art methods by a large margin with far fewer parameters.
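As a rough illustration of the attention-based multiple-instance pooling step described in the abstract, the following Python sketch pools per-instance class scores into one bag-level probability vector. It is not the authors' implementation: the function names, the linear attention scoring with `attn_w` and `attn_b`, and the toy numbers are assumptions made for illustration only.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_mil_pool(instance_scores, attn_w, attn_b=0.0):
    """Pool N instance score vectors into one bag-level probability vector.

    instance_scores: list of N instances, each a list of C per-class scores
                     (standing in for the "instance feature vectors").
    attn_w, attn_b:  hypothetical trainable attention parameters (assumed
                     linear scoring; the paper's exact form may differ).
    """
    # One scalar relevance logit per instance.
    logits = [
        sum(w * s for w, s in zip(attn_w, inst)) + attn_b
        for inst in instance_scores
    ]
    # Attention weights over instances; they sum to 1.
    alphas = softmax(logits)
    # Attention-weighted sum of instance scores, class by class.
    num_classes = len(instance_scores[0])
    bag_scores = [
        sum(a * inst[c] for a, inst in zip(alphas, instance_scores))
        for c in range(num_classes)
    ]
    # Bag-level class probabilities, which a bag label can supervise directly.
    return softmax(bag_scores), alphas


# Toy bag: three image regions (instances), two scene classes.
instances = [[2.0, 0.0], [0.0, 1.0], [3.0, 0.5]]
bag_probs, alphas = attention_mil_pool(instances, attn_w=[1.0, 0.0])
```

In this toy bag the third instance receives the largest attention weight, so its scores dominate the bag-level prediction. In training, a cross-entropy loss between `bag_probs` and the scene label would supervise the whole pipeline at the bag level.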
Pages: 4911-4926
Number of pages: 16