Context Enhancing Representation for Semantic Segmentation in Remote Sensing Images

Cited by: 19
Authors
Fang, Leyuan [1, 2]
Zhou, Peng [1, 3]
Liu, Xinxin [1, 3]
Ghamisi, Pedram [4, 5]
Chen, Siwei [6 ]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Hunan, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Hunan Univ, Key Lab Visual Percept & Artificial Intelligence, Changsha 410082, Hunan, Peoples R China
[4] Helmholtz Inst Freiberg Resource Technol, Helmholtz Zentrum Dresden Rossendorf HZDR, D-09599 Freiberg, Germany
[5] Inst Adv Res Artificial Intelligence IARAI, A-1030 Vienna, Austria
[6] Natl Univ Def Technol, State Key Lab Complex Electromagnet Environm Effe, Changsha 410073, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Context modeling; Image segmentation; Convolution; Decoding; Remote sensing; Predictive models; deep learning; feature alignment and enhancement; remote sensing images (RSIs); semantic segmentation;
DOI
10.1109/TNNLS.2022.3201820
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the foundation of image interpretation, semantic segmentation is an active topic in the field of remote sensing. Because remote sensing images (RSIs) contain complex combinations of multiscale objects, exploring and modeling contextual information has become the key to accurately identifying objects at different scales. Although several methods have been proposed in the past decade, they still model global or local context insufficiently, which easily results in the fragmentation of large-scale objects, the omission of small-scale objects, and blurred boundaries. To address these issues, we propose a contextual representation enhancement network (CRENet) to strengthen the global context (GC) and local context (LC) modeling in high-level features. The core components of the CRENet are the local feature alignment enhancement module (LFAEM) and the superpixel affinity loss (SAL). The LFAEM aligns and enhances the LC in low-level features by constructing contextual contrast through multilayer cascaded deformable convolution; the enhanced features are then combined with high-level features to refine the segmentation map. The SAL helps the network accurately capture the GC by supervising the semantic information and relationships learned from superpixels. The proposed method is plug-and-play and can be embedded in any FCN-based network. Experiments on two popular RSI datasets demonstrate the effectiveness of the proposed network, which achieves competitive performance both qualitatively and quantitatively.
Pages: 4138-4152
Page count: 15