Dual-Stream Feature Aggregation Network for Unmanned Aerial Vehicle Aerial Images Semantic Segmentation

Cited by: 0

Authors:

Li Runzeng [1]

Shi Zaifeng [1,3]

Kong Fanning [1]

Zhao Xiangyang [1]

Luo Tao [2]

Affiliations:

[1] Tianjin Univ, Sch Microelect, Tianjin 300072, Peoples R China

[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300072, Peoples R China

[3] Tianjin Key Lab Imaging & Sensing Microelect Tech, Tianjin 300072, Peoples R China

Source:

LASER & OPTOELECTRONICS PROGRESS | 2023 / Vol. 60 / No. 24

Keywords:

semantic segmentation; feature aggregation; dual-stream architecture; coordinate attention atrous spatial pyramid pooling; multi-scale feature extraction

DOI:

10.3788/LOP230955

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Subject classification codes:

0808; 0809

Abstract:

Objects in unmanned aerial vehicle (UAV) aerial images vary widely in size, which makes it difficult for a single receptive field to segment objects of all sizes well. To address this problem, a dual-stream feature aggregation network (DSFA-Net) is proposed, with two branches that extract low-level and high-level features separately. In the encoder, a low-level information extraction branch built from three serial ConvNeXt modules preserves more low-level features by generating feature maps with more channels. In the deep feature branch, the coordinate attention atrous spatial pyramid pooling (CA-ASPP) module reassigns weights to the feature maps along the channel dimension, so that the module focuses on segmentation objects of different sizes and yields deep multi-scale features. In the decoder, the bilateral guided aggregation module aggregates the low-level and deep-level features across resolutions. The method is evaluated on the AeroScapes and Semantic Drone datasets, where it reaches a mean intersection over union of 83.16% and 72.09% and a mean pixel accuracy of 90.75% and 80.34%, respectively. Compared with mainstream methods, the proposed method is better at segmenting objects that differ greatly in size, and it is suitable for semantic segmentation of UAV aerial images.
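The two metrics reported in the abstract, mean intersection over union and mean pixel accuracy, are commonly computed from a per-class confusion matrix. The following is a minimal sketch under that common definition; the record does not give the exact formulas used in the paper, so the averaging rules here are an assumption. The function names and the six-pixel toy labels are hypothetical and serve only as illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average of TP / (TP + FP + FN) over classes that occur at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # missed pixels of class c
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels wrongly labeled c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(cm):
    """Average of TP / (TP + FN) over classes present in the ground truth."""
    accs = []
    for c in range(len(cm)):
        total = sum(cm[c])
        if total > 0:
            accs.append(cm[c][c] / total)
    return sum(accs) / len(accs)


# Toy example (assumed data): six flattened pixels, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
mpa = mean_pixel_accuracy(cm)
print(f"mIoU = {miou:.4f}, mPA = {mpa:.4f}")
```

In this toy case the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5, while the per-class accuracies are 1/2, 1 and 1/2. Averaging per class rather than per pixel means that small objects count as much as large ones, which is relevant when object sizes differ as much as they do in UAV imagery.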

Pages: 9
