LOANet: a lightweight network using object attention for extracting buildings and roads from UAV aerial remote sensing images

Times cited: 9

Authors:

Han, Xiaoxiang [1,2]

Liu, Yiman [3,4]

Liu, Gang [5]

Lin, Yuanjie [2]

Liu, Qiaohong [1]

Affiliations:

[1] Shanghai Univ Med & Hlth Sci, Sch Med Instruments, Shanghai, Peoples R China

[2] Univ Shanghai Sci & Technol, Sch Hlth Sci & Engn, Shanghai, Peoples R China

[3] Shanghai Jiao Tong Univ, Shanghai Childrens Med Ctr, Sch Med, Dept Pediat Cardiol, Shanghai, Peoples R China

[4] East China Normal Univ, Sch Commun & Elect Engn, Shanghai Key Lab Multidimens Informat Proc, Shanghai, Peoples R China

[5] China Earthquake Adm, Inst Seismol, Key Lab Earthquake Geodesy, Wuhan, Hubei, Peoples R China

Source:

PeerJ Computer Science | 2023 / Volume 9

Funding:

National Natural Science Foundation of China;

Keywords:

Remote sensing image; Semantic segmentation; Context features; Lightweight network; Object attention;

DOI:

10.7717/peerj-cs.1467

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep-learning-based semantic segmentation has become a more efficient and convenient way to extract buildings and roads from uncrewed aerial vehicle (UAV) remote sensing images than traditional manual segmentation in surveying and mapping. To keep the model lightweight while improving its accuracy, a lightweight network using object attention (LOANet) is proposed for extracting buildings and roads from UAV aerial remote sensing images. The network adopts an encoder-decoder architecture in which a lightweight densely connected network (LDCNet) serves as the encoder. In the decoder, dual multi-scale context modules, consisting of an atrous spatial pyramid pooling (ASPP) module and an object attention module (OAM), are designed to capture more context information from the feature maps of UAV remote sensing images. Between the ASPP module and the OAM, a feature pyramid network (FPN) module fuses the multi-scale features extracted by the ASPP module. A private dataset of UAV remote sensing images is constructed, containing 2431 training, 945 validation, and 475 test images. The proposed basic model performs well on this dataset, achieving excellent mean Intersection-over-Union (mIoU) with only 1.4M parameters and 5.48G floating point operations (FLOPs). Further experiments on the publicly available LoveDA and CITY-OSM datasets validate the effectiveness of the proposed basic and large models, which also achieve outstanding mIoU results. The code is available at https://github.com/GtLinyer/LOANet.
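The abstract reports results as mean Intersection-over-Union (mIoU). As a reminder of what that metric measures, the following is a minimal sketch in plain Python, not the authors' evaluation code. The function name `mean_iou`, the flattened-label input format, and the class ids (0 = background, 1 = building, 2 = road) are assumptions made for illustration only.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU over flattened per-pixel class labels (illustrative helper)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # confusion[t][p] counts pixels whose true class is t and predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        # Classes absent from both prediction and ground truth are skipped.
        if union > 0:
            ious.append(tp / union)

    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical labels: 0 = background, 1 = building, 2 = road.
prediction = [0, 0, 1, 1, 2, 2]
ground_truth = [0, 1, 1, 1, 2, 0]
score = mean_iou(prediction, ground_truth, num_classes=3)
```

For these toy labels the per-class IoU values are 1/3, 2/3, and 1/2, so `score` is 0.5. Skipping classes that appear in neither the prediction nor the ground truth is one common convention; other evaluation scripts may count such classes differently.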

Pages: 22

References: 55 in total

