Hierarchical Dense Correlation Distillation for Few-Shot Segmentation

Cited by: 67
Authors
Peng, Bohao [1]
Tian, Zhuotao [4]
Wu, Xiaoyang [2]
Wang, Chengyao [1]
Liu, Shu [4]
Su, Jingyong [3]
Jia, Jiaya [1, 4]
Affiliations
[1] Chinese University of Hong Kong, Hong Kong, People's Republic of China
[2] University of Hong Kong, Hong Kong, People's Republic of China
[3] Harbin Institute of Technology, Shenzhen, People's Republic of China
[4] SmartMore, Shenzhen, People's Republic of China
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023
Keywords
NETWORK;
DOI
10.1109/CVPR52729.2023.02264
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Few-shot semantic segmentation (FSS) aims to form class-agnostic models that segment unseen classes with only a handful of annotations. Previous methods, limited to semantic features and prototype representations, suffer from coarse segmentation granularity and train-set overfitting. In this work, we design the Hierarchically Decoupled Matching Network (HDMNet), which mines pixel-level support correlation based on the transformer architecture. Self-attention modules assist in establishing hierarchical dense features as a means to accomplish cascade matching between query and support features. Moreover, we propose a matching module to reduce train-set overfitting and introduce correlation distillation, which leverages semantic correspondence from coarse resolution to boost fine-grained segmentation. Our method performs decently in experiments: it achieves 50.0% mIoU on the COCO-20^i dataset in the one-shot setting and 56.0% in the five-shot setting. The code is available on the project website.
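The two ideas named in the abstract, pixel-level query-support correlation and coarse-to-fine correlation distillation, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes cosine similarity as the correlation measure, a softmax with a hypothetical temperature of 0.1, a KL-divergence distillation loss, and a coarse (teacher) map that has already been resized to the fine (student) grid. All function names are illustrative.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def softmax(row, temperature=1.0):
    """Numerically stable softmax over one row of similarities."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dense_correlation(query_feats, support_feats, support_mask, temperature=0.1):
    """For each query pixel, a distribution over foreground support pixels.

    query_feats / support_feats: lists of per-pixel feature vectors.
    support_mask: 0/1 flags marking annotated foreground support pixels.
    The temperature value is an assumption, not taken from the paper.
    """
    foreground = [s for s, m in zip(support_feats, support_mask) if m]
    return [
        softmax([cosine(q, s) for s in foreground], temperature)
        for q in query_feats
    ]


def kl_distillation(teacher, student, eps=1e-8):
    """Mean KL(teacher || student) over query pixels.

    Assumes both correlation maps share the same shape, i.e. the coarse
    teacher map was resized to the student's resolution beforehand.
    """
    total = 0.0
    for t_row, s_row in zip(teacher, student):
        total += sum(
            t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row)
        )
    return total / len(teacher)
```

In this sketch the coarse-level correlation acts as a soft target, so minimizing the loss pulls the fine-level correlation toward the semantic correspondence found at coarse resolution. How the paper builds its teacher signal and weights the loss may differ.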
Pages: 23641-23651
Number of pages: 11