Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

Cited by: 70
|
Authors
Shi, Xinyu [1 ]
Wei, Dong [2 ]
Zhang, Yu [1 ]
Lu, Donghuan [2 ]
Ning, Munan [2 ]
Chen, Jiashun [1 ]
Ma, Kai [2 ]
Zheng, Yefeng [2 ]
Affiliations
[1] Southeast Univ, Sch Comp Sci & Engn, Key Lab Comp Network & Informat Integrat, Minist Educ, Nanjing, Peoples R China
[2] Tencent Jarvis Lab, Shenzhen, Peoples R China
Source
COMPUTER VISION, ECCV 2022, PT XX | 2022, Vol. 13680
Funding
National Key R&D Program of China;
Keywords
Few-shot segmentation; Dense cross-query-and-support attention; Attention weighted mask aggregation;
DOI
10.1007/978-3-031-20044-1_9
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Research into Few-shot Semantic Segmentation (FSS) has attracted great attention, with the goal of segmenting target objects in a query image given only a few annotated support images of the target class. A key to this challenging task is to fully utilize the information in the support images by exploiting fine-grained correlations between the query and support images. However, most existing approaches either compressed the support information into a few class-wise prototypes, or used partial support information (e.g., only the foreground) at the pixel level, causing non-negligible information loss. In this paper, we propose Dense pixel-wise Cross-query-and-support Attention weighted Mask Aggregation (DCAMA), where both foreground and background support information are fully exploited via multi-level pixel-wise correlations between paired query and support features. Implemented with the scaled dot-product attention of the Transformer architecture, DCAMA treats every query pixel as a token, computes its similarities with all support pixels, and predicts its segmentation label as an additive aggregation of all the support pixels' labels, weighted by the similarities. Based on the unique formulation of DCAMA, we further propose efficient and effective one-pass inference for n-shot segmentation, where pixels of all support images are collected for the mask aggregation at once. Experiments show that our DCAMA significantly advances the state of the art on the standard FSS benchmarks of PASCAL-5(i), COCO-20(i), and FSS-1000, e.g., with 3.1%, 9.7%, and 3.6% absolute improvements in 1-shot mIoU over the previous best records. Ablation studies also verify the design of DCAMA.
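The core mechanism the abstract describes — every query pixel as a token, scaled dot-product similarity against all support pixels, and the prediction formed by aggregating support mask labels weighted by those similarities — can be sketched in NumPy. This is an illustrative sketch only, not the authors' implementation: the function name, flat per-pixel feature shapes, and the single-level, single-head setting are assumptions (DCAMA operates on multi-level features inside a Transformer architecture).

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_weighted_mask_aggregation(q_feat, s_feat, s_mask):
    """Hypothetical sketch of attention-weighted mask aggregation.

    q_feat: (Nq, d) query pixel features (one token per query pixel)
    s_feat: (Ns, d) support pixel features (foreground AND background)
    s_mask: (Ns,)   support mask labels in {0, 1}
    returns (Nq,)   predicted foreground scores for the query pixels
    """
    d = q_feat.shape[1]
    # scaled dot-product similarity of every query pixel with every support pixel
    attn = softmax(q_feat @ s_feat.T / np.sqrt(d), axis=-1)  # (Nq, Ns)
    # additive aggregation of support labels, weighted by the similarities
    return attn @ s_mask  # (Nq,)
```

Under this formulation, one-pass n-shot inference falls out naturally: pixels from all n support images are simply concatenated into `s_feat` and `s_mask`, and a single aggregation pass produces the query prediction.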
Pages: 151-168
Page count: 18
Related Papers
50 items total
  • [1] Few-Shot Semantic Segmentation via Mask Aggregation
    Ao, Wei
    Zheng, Shunyi
    Meng, Yan
    Yang, Yang
    NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [2] Global-Local Query-Support Cross-Attention for Few-Shot Semantic Segmentation
    Xie, Fengxi
    Liang, Guozhen
    Chien, Ying-Ren
    MATHEMATICS, 2024, 12 (18)
  • [3] Query-support semantic correlation mining for few-shot segmentation
    Shao, Ji
    Gong, Bo
    Dai, Kanyuan
    Li, Daoliang
    Jing, Ling
    Chen, Yingyi
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [4] CobNet: Cross Attention on Object and Background for Few-Shot Segmentation
    Guan, Haoyan
    Spratling, Michael
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 39 - 45
  • [5] Few-Shot Segmentation based on Global-cross Attention
    Wang, Cailing
    Xu, Yinpeng
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4905 - 4910
  • [6] Mask Matching Transformer for Few-Shot Segmentation
    Jiao, Siyu
    Zhang, Gengwei
    Navasardyan, Shant
    Chen, Ling
    Zhao, Yao
    Wei, Yunchao
    Shi, Humphrey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [7] MASK-GUIDED ATTENTION AND EPISODE ADAPTIVE WEIGHTS FOR FEW-SHOT SEGMENTATION
    Kwon, Hyeongjun
    Song, Taeyong
    Kim, Sunok
    Sohn, Kwanghoon
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2611 - 2615
  • [8] Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining
    Nie, Jiahao
    Xing, Yun
    Zhang, Gongjie
    Yan, Pei
    Xiao, Aoran
    Tan, Yap-Peng
    Kot, Alex C.
    Lu, Shijian
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 3380 - 3390
  • [9] Dense affinity matching for Few-Shot Segmentation
    Chen, Hao
    Dong, Yonghan
    Lu, Zheming
    Yu, Yunlong
    Li, Yingming
    Han, Jungong
    Zhang, Zhongfei
    NEUROCOMPUTING, 2024, 577