RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation Based on Visual Foundation Model

Cited by: 162
Authors
Chen, Keyan [1 ,2 ,3 ]
Liu, Chenyang [1 ,2 ,3 ]
Chen, Hao [3 ]
Zhang, Haotian [1 ,2 ,3 ]
Li, Wenyuan [4 ]
Zou, Zhengxia [3 ,5 ]
Shi, Zhenwei [1 ,2 ,3 ]
Affiliations
[1] Beihang Univ, Image Proc Ctr, Sch Astronaut, Beijing 100191, Peoples R China
[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[3] Shanghai Artificial Intelligence Lab, Shanghai 200232, Peoples R China
[4] Univ Hong Kong, Dept Geog, Hong Kong, Peoples R China
[5] Beihang Univ, Sch Astronaut, Dept Guidance Nav & Control, Beijing 100191, Peoples R China
Source
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2024, Vol. 62
Keywords
Foundation model; instance segmentation; prompt learning; remote sensing images; segment anything model (SAM); OBJECT DETECTION;
DOI
10.1109/TGRS.2024.3356074
Chinese Library Classification (CLC)
P3 [Geophysics]; P59 [Geochemistry]
Subject Classification Codes
0708; 070902
Abstract
Leveraging the extensive training data from SA-1B, the segment anything model (SAM) demonstrates remarkable generalization and zero-shot capabilities. However, as a category-agnostic instance segmentation method, SAM relies heavily on prior manual guidance, including points, boxes, and coarse-grained masks. Furthermore, its performance on remote sensing image segmentation tasks remains largely unexplored and unproven. In this article, we aim to develop an automated instance segmentation approach for remote sensing images that builds on the SAM foundation model while incorporating semantic category information. Drawing inspiration from prompt learning, we propose a method to learn the generation of appropriate prompts for SAM. This enables SAM to produce semantically discernible segmentation results for remote sensing images, an approach we term RSPrompter. We also propose several SAM-based derivatives for instance segmentation tasks, drawing on recent advancements within the SAM community, and compare their performance with RSPrompter. Extensive experimental results on the WHU building dataset, the NWPU VHR-10 dataset, and the SAR Ship Detection Dataset (SSDD) validate the effectiveness of our proposed method. The code for our method is publicly available at https://kychen.me/RSPrompter.
Pages: 1-17
Page count: 17
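
The abstract sketches the core idea: instead of hand-supplied points or boxes, a learned prompter maps image features to prompt embeddings that drive a frozen SAM-style mask decoder, with an added class head supplying the semantic categories SAM itself lacks. The following minimal PyTorch sketch only illustrates that idea; the query-based design, module names, and tensor shapes are illustrative assumptions, not the authors' released implementation (see https://kychen.me/RSPrompter for the actual code).

# Minimal sketch (not the authors' implementation): a learnable prompter that
# turns backbone features into prompt embeddings for a frozen SAM-style mask
# decoder, plus per-prompt class logits. All names and shapes are assumptions.
import torch
import torch.nn as nn


class PromptGenerator(nn.Module):
    """Predicts K prompt embeddings and per-prompt class logits from image features."""

    def __init__(self, feat_dim=256, prompt_dim=256, num_prompts=8, num_classes=10):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_prompts, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=8, batch_first=True)
        self.to_prompt = nn.Linear(feat_dim, prompt_dim)  # embeddings in place of manual prompts
        self.to_class = nn.Linear(feat_dim, num_classes)  # semantic category head

    def forward(self, feats):
        # feats: (B, HW, feat_dim) flattened image-encoder features
        b = feats.size(0)
        q = self.queries.unsqueeze(0).expand(b, -1, -1)   # (B, K, feat_dim)
        q, _ = self.attn(q, feats, feats)                 # queries attend to image features
        return self.to_prompt(q), self.to_class(q)


if __name__ == "__main__":
    # Random features stand in for the output of a (frozen) SAM image encoder.
    feats = torch.randn(2, 64 * 64, 256)
    prompter = PromptGenerator()
    prompt_embeds, class_logits = prompter(feats)
    print(prompt_embeds.shape, class_logits.shape)        # (2, 8, 256), (2, 8, 10)
    # prompt_embeds would replace SAM's hand-crafted point/box prompts at the mask decoder.

In this sketch the prompter is the only trainable part; keeping the heavy SAM encoder and decoder frozen is one plausible way to exploit the foundation model's generalization while adding the category information the abstract describes.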