HIGH-FIDELITY LAKE EXTRACTION VIA TWO-STAGE PROMPT ENHANCEMENT: ESTABLISHING A NOVEL BASELINE AND BENCHMARK

Times cited: 0

Authors:

Chen, Ben [1]

Liu, Xuechao [1]

Li, Kai [2]

Zhang, Yu [1]

Xing, Junliang [2]

Tao, Pin [1,2]

Affiliations:

[1] Qinghai Univ, Dept Comp Technol & Applicat, Xining, Peoples R China

[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2024 | 2024

Keywords:

Lake Extraction; Semantic Segmentation; Prompt Learning; Vision Transformer;

DOI:

10.1109/ICME57554.2024.10688015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Lake extraction from remote sensing imagery is challenging because lake shapes vary widely and the data are noisy. Current methods rely on multispectral image datasets, making it difficult to learn lake features accurately from pixel arrangements. This, in turn, hampers model learning and the generation of accurate segmentation masks. This paper introduces a prompt-based dataset construction approach that provides approximate lake locations using point, box, and mask prompts. We also propose a two-stage prompt enhancement framework, LEPrompter, with prompt-based and prompt-free stages during training. The prompt-based stage employs a prompt encoder to extract prior information and integrates prompt tokens with the image embedding through self- and cross-attention in the prompt decoder. Prompts are deactivated during inference to ensure independence from them, enabling automated lake extraction without introducing additional parameters or GFLOPs. Extensive experiments show that the proposed approach improves on the previous state-of-the-art method. The source code is available at https://github.com/BastianChen/LEPrompter.


Pages: 6
