Hierarchical Shared Architecture Search for Real-Time Semantic Segmentation of Remote Sensing Images

Cited by: 3
Authors
Wang, Wenna [1, 2]
Ran, Lingyan [1, 2]
Yin, Hanlin [1, 2]
Sun, Mingjun [1, 2]
Zhang, Xiuwei [1, 2]
Zhang, Yanning [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710072, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024 / Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Feature aggregation module; hierarchical shared search strategy; neural architecture search (NAS); real-time semantic segmentation; NETWORK;
DOI
10.1109/TGRS.2024.3373493
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Real-time semantic segmentation of remote sensing images is challenging because it demands a trade-off between speed and accuracy. Beyond manually designed networks, researchers have sought to use neural architecture search (NAS) to automatically discover a real-time semantic segmentation model with optimal performance. Most existing NAS methods stack no more than two types of searched cells, overlooking the characteristics of resolution variation. This article proposes the hierarchical shared architecture search (HAS) method to automatically build a real-time semantic segmentation model for remote sensing images. The model contains a lightweight backbone and a multiscale feature fusion module. The lightweight backbone is carefully designed for low computational cost. The multiscale feature fusion module is searched with NAS, where only the blocks from the same layer share identical cells. Extensive experiments show that the searched model achieves a state-of-the-art trade-off between accuracy and speed on remote sensing images. Specifically, on the LoveDA, Potsdam, and Vaihingen datasets, the searched network achieves 54.5%, 87.8%, and 84.1% mIoU, respectively, at an inference speed of 132.7 FPS. In addition, the searched network achieves 72.6% mIoU at 164.0 FPS on the Cityscapes dataset and 72.3% mIoU at 186.4 FPS on the CamVid dataset.
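Note on the metric: the accuracy figures in the abstract are reported as mIoU (mean intersection over union). The sketch below assumes the common definition, in which the per-class IoU is averaged over the classes that occur in the prediction or the ground truth. It is not the authors' evaluation code, and the function name `mean_iou` and its flat-label-list interface are illustrative assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in pred or target (assumed convention)."""
    # confusion[t][p] counts pixels whose true class is t and predicted class is p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        # false positives: predicted c, true class is something else
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        # false negatives: true class c, predicted something else
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both pred and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with four pixels and two classes
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Benchmark-specific conventions (ignored labels, which classes are averaged) may differ from this sketch.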
Pages: 18 / 18
Page count: 1