HIGH-FIDELITY LAKE EXTRACTION VIA TWO-STAGE PROMPT ENHANCEMENT: ESTABLISHING A NOVEL BASELINE AND BENCHMARK

Cited by: 0
Authors
Chen, Ben [1]
Liu, Xuechao [1]
Li, Kai [2]
Zhang, Yu [1]
Xing, Junliang [2]
Tao, Pin [1, 2]
Affiliations
[1] Qinghai Univ, Dept Comp Technol & Applicat, Xining, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2024 | 2024
Keywords
Lake Extraction; Semantic Segmentation; Prompt Learning; Vision Transformer
DOI
10.1109/ICME57554.2024.10688015
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Lake extraction from remote sensing imagery is a complex challenge due to the varied lake shapes and data noise. Current methods rely on multispectral image datasets, making it challenging to learn lake features accurately from pixel arrangements. This, in turn, affects model learning and the creation of accurate segmentation masks. This paper introduces a prompt-based dataset construction approach that provides approximate lake locations using point, box, and mask prompts. We also propose a two-stage prompt enhancement framework, LEPrompter, with prompt-based and prompt-free stages during training. The prompt-based stage employs a prompt encoder to extract prior information, integrating prompt tokens and image embedding through self- and cross-attention in the prompt decoder. Prompts are deactivated to ensure independence during inference, enabling automated lake extraction without introducing additional parameters and GFLOPs. Extensive experiments showcase performance improvements of our proposed approach compared to the previous state-of-the-art method. The source code is available at https://github.com/BastianChen/LEPrompter.
Pages: 6