Hierarchical Shared Architecture Search for Real-Time Semantic Segmentation of Remote Sensing Images

Cited by: 3
Authors
Wang, Wenna [1 ,2 ]
Ran, Lingyan [1 ,2 ]
Yin, Hanlin [1 ,2 ]
Sun, Mingjun [1 ,2 ]
Zhang, Xiuwei [1 ,2 ]
Zhang, Yanning [1 ,2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Shaanxi Prov Key Lab Speech & Image Informat Proc, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710072, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024, Vol. 62
Funding
National Natural Science Foundation of China;
Keywords
Feature aggregation module; hierarchical shared search strategy; neural architecture search (NAS); real-time semantic segmentation; NETWORK;
DOI
10.1109/TGRS.2024.3373493
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry];
Discipline Classification Code
0708 ; 070902 ;
Abstract
Real-time semantic segmentation of remote sensing images demands a trade-off between speed and accuracy, which makes it challenging. Beyond manually designed networks, researchers have adopted neural architecture search (NAS) to automatically discover real-time semantic segmentation models with optimal performance. Most existing NAS methods stack no more than two types of searched cells, ignoring how feature characteristics vary with resolution. This article proposes the hierarchical shared architecture search (HAS) method to automatically build a real-time semantic segmentation model for remote sensing images. Our model contains a lightweight backbone and a multiscale feature fusion module. The lightweight backbone is carefully designed for low computational cost. The multiscale feature fusion module is searched with NAS, where only the blocks in the same layer share identical cells. Extensive experiments show that our searched real-time semantic segmentation model achieves a state-of-the-art trade-off between accuracy and speed on remote sensing images. Specifically, on the LoveDA, Potsdam, and Vaihingen datasets, the searched network achieves 54.5%, 87.8%, and 84.1% mIoU, respectively, at an inference speed of 132.7 FPS. In addition, the searched network achieves 72.6% mIoU at 164.0 FPS on the CityScapes dataset and 72.3% mIoU at 186.4 FPS on the CamVid dataset.
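The abstract's core search-space idea — each layer of the fusion module gets its own searched cell, and every block within that layer reuses that identical cell — can be illustrated with a minimal toy sketch. This is not the paper's actual search space or training procedure; the candidate operations, the random sampling, and the list-based "feature blocks" are all illustrative placeholders.

```python
import random

# Hypothetical candidate operations for a searched cell; the real
# search space in the paper is not specified here.
CANDIDATE_OPS = {
    "conv3x3": lambda x: [v * 2 for v in x],
    "conv5x5": lambda x: [v * 3 for v in x],
    "skip":    lambda x: x,
}

def sample_hierarchical_shared_arch(num_layers, rng):
    """Sample one cell (operation) per layer.

    Hierarchical sharing: the choice is made per *layer*, so every
    block in that layer reuses the same cell, unlike per-block search
    (one cell per block) or fully shared search (one cell overall).
    """
    return [rng.choice(sorted(CANDIDATE_OPS)) for _ in range(num_layers)]

def run_fusion_module(arch, multiscale_blocks):
    """Apply the layer-shared cell to each block of every layer."""
    outputs = []
    for op_name, blocks in zip(arch, multiscale_blocks):
        cell = CANDIDATE_OPS[op_name]
        # The identical cell is applied to all blocks of this layer,
        # regardless of their (toy) resolution.
        outputs.append([cell(b) for b in blocks])
    return outputs

rng = random.Random(0)
arch = sample_hierarchical_shared_arch(num_layers=3, rng=rng)
# Three layers with different numbers of multiscale blocks.
blocks = [[[1, 2], [3, 4]], [[5]], [[6], [7], [8]]]
out = run_fusion_module(arch, blocks)
```

The key point is that `arch` has one entry per layer, not per block, which shrinks the search space relative to searching every block independently while still allowing cells to differ across layers.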
Pages: 18 / 18
Number of pages: 1