Bidirectional mutual guidance transformer for salient object detection in optical remote sensing images

Cited: 6
Authors
Huang, Kan [1 ]
Tian, Chunwei [2 ]
Li, Ge [3 ]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai, Peoples R China
[2] Northwestern Polytech Univ, Sch Software, Xian, Peoples R China
[3] Peking Univ, Sch Elect & Comp Engn, Shenzhen, Peoples R China
Funding
China Postdoctoral Science Foundation; US National Science Foundation;
Keywords
Salient object detection; optical remote sensing images; Transformer; NETWORK;
DOI
10.1080/01431161.2023.2229494
Chinese Library Classification (CLC) number
TP7 [Remote Sensing Technology];
Discipline classification codes
081102; 0816; 081602; 083002; 1404;
Abstract
Salient object detection in optical remote sensing images presents great challenges due to the characteristics of such images, including cluttered backgrounds, varying object scales, and unstable imaging conditions. In this paper, we present a Bidirectional Mutual Guidance Transformer (BMGT), which mitigates the locality issue of CNN-based models and exploits the mutual guidance between global context-aware object representations and fine-grained boundary structures. It contains a hierarchically structured Transformer encoder that extracts multi-level, multi-scale token representations, and a dual-stream cross-task MLP decoder that performs joint salient object detection and salient boundary detection in an end-to-end manner. In particular, the dual-stream decoder consists of two sub-branch networks with symmetric architectures, connected by a newly proposed Mutual Guidance MLP layer (MG-MLP). Through MG-MLP, salient object features and salient boundary features interact with each other, facilitating complementary learning at multiple network levels. Extensive evaluations demonstrate that the proposed method outperforms existing methods on two public remote sensing image benchmarks, showing that BMGT is advantageous in exploiting long-range context dependencies as well as preserving fine-grained boundary structures.
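The abstract describes the MG-MLP as a layer through which the object stream and the boundary stream of the dual-stream decoder guide each other. The record does not give the layer's exact formulation, so the following is only a minimal illustrative sketch of the general idea of mutual cross-stream guidance, assuming a simple residual form in which each stream is refined by an MLP of the other stream's features; the function and parameter names (`mg_mlp`, `f_obj`, `f_bnd`) are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    # Toy two-layer MLP; tanh stands in for whatever activation the paper uses.
    return np.tanh(x @ w1) @ w2

def mg_mlp(f_obj, f_bnd, params):
    """Sketch of a mutual-guidance layer: each stream receives a residual
    update computed from the other stream, so object and boundary features
    interact at this network level."""
    w1o, w2o, w1b, w2b = params
    f_obj_new = f_obj + mlp(f_bnd, w1b, w2b)  # boundary features guide the object stream
    f_bnd_new = f_bnd + mlp(f_obj, w1o, w2o)  # object features guide the boundary stream
    return f_obj_new, f_bnd_new

d = 32  # illustrative feature dimension
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
f_obj = rng.standard_normal((64, d))  # 64 tokens of salient-object features
f_bnd = rng.standard_normal((64, d))  # 64 tokens of salient-boundary features
f_obj2, f_bnd2 = mg_mlp(f_obj, f_bnd, params)
print(f_obj2.shape, f_bnd2.shape)
```

Because both streams keep the same token shape, such a layer can be inserted at multiple decoder levels, matching the abstract's claim of complementary learning across network levels.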
Pages: 4016-4033
Page count: 18