Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

Cited by: 67

Authors:

Peng, Bohao [1]

Tian, Zhuotao [4]

Wu, Xiaoyang [2]

Wang, Chengyao [1]

Liu, Shu [4]

Su, Jingyong [3]

Jia, Jiaya [1,4]

Affiliations:

[1] Chinese Univ Hong Kong, Hong Kong, Peoples R China

[2] Univ Hong Kong, Hong Kong, Peoples R China

[3] Harbin Inst Technol, Shenzhen, Peoples R China

[4] SmartMore, Shenzhen, Peoples R China

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

Keywords:

NETWORK;

DOI:

10.1109/CVPR52729.2023.02264

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Few-shot semantic segmentation (FSS) aims to form class-agnostic models that segment unseen classes with only a handful of annotations. Previous methods, limited to semantic features and prototype representations, suffer from coarse segmentation granularity and train-set overfitting. In this work, we design the Hierarchically Decoupled Matching Network (HDMNet), which mines pixel-level support correlation based on the transformer architecture. The self-attention modules are used to assist in establishing hierarchical dense features, as a means to accomplish the cascade matching between query and support features. Moreover, we propose a matching module to reduce train-set overfitting and introduce correlation distillation, which leverages semantic correspondence from coarse resolution to boost fine-grained segmentation. Our method performs decently in experiments. We achieve 50.0% mIoU on the COCO-20^i dataset in the one-shot setting and 56.0% in the five-shot setting. The code is available on the project website.
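The abstract names two ingredients, pixel-level correlation between query and support features and correlation distillation from a coarse resolution to a finer one, without giving formulas. The following is a minimal sketch of one common way to express such ideas, not the authors' implementation. It assumes cosine similarity as the correlation, a softmax over support pixels, and a KL-divergence loss between a coarse "teacher" map and a fine "student" map that have already been resampled to the same shape. The function names `cosine_correlation`, `softmax`, and `kl_distillation` and the `temperature` parameter are hypothetical.

```python
import math


def cosine_correlation(query, support):
    """Return an (Nq x Ns) list of cosine similarities.

    query and support are lists of per-pixel feature vectors.
    """
    def unit(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard zero vectors
        return [x / norm for x in vec]

    q_unit = [unit(v) for v in query]
    s_unit = [unit(v) for v in support]
    return [[sum(a * b for a, b in zip(q, s)) for s in s_unit] for q in q_unit]


def softmax(row, temperature=1.0):
    """Numerically stable softmax over one row of correlation scores."""
    top = max(row)
    exps = [math.exp((x - top) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def kl_distillation(teacher_corr, student_corr, temperature=1.0):
    """Mean KL(teacher || student) over query pixels.

    Assumption: both maps share the same (Nq x Ns) shape, i.e. the coarse
    map was upsampled beforehand. This is an illustrative loss only.
    """
    losses = []
    for t_row, s_row in zip(teacher_corr, student_corr):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        losses.append(sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0))
    return sum(losses) / len(losses)


# Toy usage: two query pixels and two support pixels with 2-D features.
query_feats = [[1.0, 0.0], [0.0, 1.0]]
support_feats = [[1.0, 0.0], [1.0, 1.0]]
fine_corr = cosine_correlation(query_feats, support_feats)
coarse_corr = [[0.0, 1.0], [1.0, 0.0]]  # stand-in for an upsampled coarse map
loss = kl_distillation(coarse_corr, fine_corr)
```

In this sketch the loss is zero when the two correlation maps agree and grows as the fine-resolution map diverges from the coarse one. How the paper weights or masks this term is not stated in the abstract.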


Pages: 23641-23651

Page count: 11

