Transcending Pixels: Boosting Saliency Detection via Scene Understanding From Aerial Imagery

Cited by: 47
Authors
Liu, Yanfeng [1, 2]
Xiong, Zhitong [3 ]
Yuan, Yuan [2 ]
Wang, Qi [2 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
[2] Northwestern Polytech Univ, Sch Artificial Intelligence Opt & Elect iOPEN, Xian 710072, Peoples R China
[3] Tech Univ Munich TUM, Chair Data Sci Earth Observat, D-80333 Munich, Germany
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2023 / Vol. 61
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Task analysis; Remote sensing; Saliency detection; Context modeling; Superresolution; Sports; Conditional guidance learning; dynamic class activation map (CAM); optical remote sensing image (RSI); salient object detection (SOD); scene knowledge distillation; OBJECT DETECTION; ATTENTION; CLASSIFICATION; NETWORK;
DOI
10.1109/TGRS.2023.3298661
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
Existing remote sensing image salient object detection (RSI-SOD) methods widely perform object-level semantic understanding with pixel-level supervision, but ignore the image-level scene information. As a fundamental attribute of remote sensing images (RSIs), the scene has a complex intrinsic correlation with salient objects, which may bring hints to improve saliency detection performance. However, existing RSI-SOD datasets lack both pixel- and image-level labels, and it is non-trivial to effectively transfer the scene domain knowledge for more accurate saliency localization. To address these challenges, we first annotate the image-level scene labels of three RSI-SOD datasets inspired by remote sensing scene classification. On top of it, we present a novel scene-guided dual-branch network (SDNet), which can perform cross-task knowledge distillation from the scene classification to facilitate accurate saliency detection. Specifically, a scene knowledge transfer module (SKTM) and a conditional dynamic guidance module (CDGM) are designed for extracting saliency key area as spatial attention from the scene subnet and guiding the saliency subnet to generate scene-enhanced saliency features, respectively. Finally, an object contour awareness module (OCAM) is introduced to enable the model to focus more on irregular spatial details of salient objects from the complicated background. Extensive experiments reveal that our SDNet outperforms over 20 state-of-the-art algorithms on three datasets. Moreover, we prove that the proposed framework is model-agnostic, and its extension to six baselines can bring significant performance benefits.
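The abstract describes taking a key-area map from the scene subnet and using it as spatial attention over saliency features, and the keywords mention a dynamic class activation map (CAM). The snippet below is a minimal sketch of that general idea only: a plain CAM computed as a weighted sum of feature channels, min-max normalized, then applied as residual attention. It is not the authors' SKTM or CDGM, whose details the abstract does not give. Every function name, the toy tensors, and the residual form `f * (1 + a)` are assumptions made for illustration.

```python
# Illustrative sketch only: a class activation map (CAM) used as spatial attention.
# All names and values are hypothetical and are not taken from the paper.

def class_activation_map(features, class_weights):
    """Weighted sum over channels. features is C x H x W (nested lists);
    class_weights holds one classifier weight per channel for a single scene class."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def normalize(cam):
    """Min-max scale the map to [0, 1]; a constant map becomes all zeros."""
    flat = [value for row in cam for value in row]
    low, high = min(flat), max(flat)
    span = high - low
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(value - low) / span for value in row] for row in cam]


def apply_attention(saliency_features, attention):
    """Residual modulation f * (1 + a), so no location is fully suppressed."""
    return [
        [[value * (1.0 + attention[y][x]) for x, value in enumerate(row)]
         for y, row in enumerate(channel)]
        for channel in saliency_features
    ]


# Toy scene-branch features: 2 channels on a 2 x 2 grid.
scene_features = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
scene_class_weights = [1.0, 0.5]  # assumed weights for the predicted scene class

attention = normalize(class_activation_map(scene_features, scene_class_weights))

# Toy saliency-branch features: 1 channel on the same grid.
saliency_features = [[[1.0, 1.0], [1.0, 1.0]]]
enhanced = apply_attention(saliency_features, attention)
```

In this toy case the bottom-right location receives the highest scene activation, so its saliency feature is amplified the most, while locations with the lowest activation pass through unchanged.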
Pages: 16