LOANet: a lightweight network using object attention for extracting buildings and roads from UAV aerial remote sensing images

Cited by: 9
Authors
Han, Xiaoxiang [1, 2]
Liu, Yiman [3, 4]
Liu, Gang [5]
Lin, Yuanjie [2]
Liu, Qiaohong [1]
Affiliations
[1] Shanghai Univ Med & Hlth Sci, Sch Med Instruments, Shanghai, Peoples R China
[2] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, Shanghai Childrens Med Ctr, Sch Med, Dept Pediat Cardiol, Shanghai, Peoples R China
[4] East China Normal Univ, Sch Commun & Elect Engn, Shanghai Key Lab Multidimens Informat Proc, Shanghai, Peoples R China
[5] China Earthquake Adm, Inst Seismol, Key Lab Earthquake Geodesy, Wuhan, Hubei, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Remote sensing image; Semantic segmentation; Context features; Lightweight network; Object attention;
DOI
10.7717/peerj-cs.1467
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep-learning-based semantic segmentation has become a more efficient and convenient way to extract buildings and roads from uncrewed aerial vehicle (UAV) remote sensing images than traditional manual segmentation in surveying and mapping. To keep the model lightweight while improving its accuracy, a lightweight network using object attention (LOANet) for extracting buildings and roads from UAV aerial remote sensing images is proposed. The network adopts an encoder-decoder architecture in which a lightweight densely connected network (LDCNet) serves as the encoder. In the decoder, dual multi-scale context modules, consisting of an atrous spatial pyramid pooling (ASPP) module and an object attention module (OAM), are designed to capture more context information from the feature maps of UAV remote sensing images. Between the ASPP module and the OAM, a feature pyramid network (FPN) module fuses the multi-scale features extracted by the ASPP module. A private dataset of UAV remote sensing images is constructed, containing 2,431 training, 945 validation, and 475 test images. The proposed basic model performs well on this dataset, achieving excellent mean Intersection-over-Union (mIoU) with only 1.4M parameters and 5.48G floating point operations (FLOPs). Additional experiments on the publicly available LoveDA and CITY-OSM datasets further validate the effectiveness of the proposed basic and large models, which also achieve outstanding mIoU results. All code is available at https://github.com/GtLinyer/LOANet.
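The abstract reports results as mean Intersection-over-Union (mIoU). As a minimal sketch of how that metric is commonly computed, and not the authors' evaluation code, the pure-Python function below averages the per-class IoU over two label maps. The function name `mean_iou`, the class indices (0 = background, 1 = building, 2 = road), and the toy label maps are assumptions made for illustration only.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are equally sized 2-D lists of integer class labels.
    """
    intersection = [0] * num_classes  # pixels where both maps agree on class c
    union = [0] * num_classes         # pixels where either map says class c
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatch adds one pixel to the union of both classes.
                union[p] += 1
                union[t] += 1
    # Skip classes absent from both maps so they do not distort the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x3 label maps: 0 = background, 1 = building, 2 = road.
pred = [[0, 0, 1],
        [1, 2, 2]]
target = [[0, 1, 1],
          [1, 2, 0]]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Evaluation pipelines usually accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores.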
Pages: 22