Dual Branch Multi-Level Semantic Learning for Few-Shot Segmentation

Cited by: 19
Authors
Chen, Yadang [1, 2]
Jiang, Ren [1, 2]
Zheng, Yuhui [3, 4]
Sheng, Bin [5]
Yang, Zhi-Xin [6]
Wu, Enhua [7]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[2] Minist Educ, Engn Res Ctr Digital Forens, Nanjing 210044, Peoples R China
[3] Qinghai Normal Univ, Coll Comp, Xining 810016, Peoples R China
[4] Nanjing Univ Informat Sci & Technol, Sch Comp Sci, Nanjing 210044, Peoples R China
[5] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[6] Univ Macau, Dept Electromech Engn, State Key Lab Internet Things Smart City, Macau, Peoples R China
[7] Chinese Acad Sci, State Key Lab Comp Sci, Inst Software, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Prototypes; Training; Semantics; Semantic segmentation; Self-supervised learning; Feature extraction; Measurement; Few-shot learning; semantic segmentation; contrastive learning; metric learning; NETWORK;
DOI
10.1109/TIP.2024.3364056
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples in support images. Although progress has been made recently by combining prototype-based metric learning, existing methods still face two main challenges. First, various intra-class objects between the support and query images or semantically similar inter-class objects can seriously harm the segmentation performance due to their poor feature representations. Second, the latent novel classes are treated as the background in most methods, leading to a learning bias, whereby these novel classes are difficult to correctly segment as foreground. To solve these problems, we propose a dual-branch learning method. The class-specific branch encourages representations of objects to be more distinguishable by increasing the inter-class distance while decreasing the intra-class distance. In parallel, the class-agnostic branch focuses on minimizing the foreground class feature distribution and maximizing the features between the foreground and background, thus increasing the generalizability to novel classes in the test stage. Furthermore, to obtain more representative features, pixel-level and prototype-level semantic learning are both involved in the two branches. The method is evaluated on PASCAL-5^i 1-shot, PASCAL-5^i 5-shot, COCO-20^i 1-shot, and COCO-20^i 5-shot, and extensive experiments show that our approach is effective for few-shot semantic segmentation despite its simplicity.
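The abstract builds on prototype-based metric learning. The sketch below illustrates only that generic baseline idea (masked average pooling of support features into a prototype, then cosine matching of query pixels), not the dual-branch method itself. All function names, array shapes, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels selected by a binary mask.

    features: (H, W, C) float array; mask: (H, W) array of 0/1.
    Returns a (C,) prototype vector.
    """
    weights = mask.astype(features.dtype)[..., None]    # (H, W, 1)
    total = (features * weights).sum(axis=(0, 1))       # (C,)
    return total / max(weights.sum(), 1e-8)

def cosine_similarity_map(features, prototype):
    """Cosine similarity between every pixel feature and one prototype."""
    feat_norm = np.linalg.norm(features, axis=-1) + 1e-8   # (H, W)
    proto_norm = np.linalg.norm(prototype) + 1e-8
    return (features @ prototype) / (feat_norm * proto_norm)

def segment_query(support_feat, support_mask, query_feat):
    """Label each query pixel as foreground (1) if it is closer to the
    foreground prototype than to the background prototype."""
    fg_proto = masked_average_pooling(support_feat, support_mask)
    bg_proto = masked_average_pooling(support_feat, 1 - support_mask)
    fg_sim = cosine_similarity_map(query_feat, fg_proto)
    bg_sim = cosine_similarity_map(query_feat, bg_proto)
    return (fg_sim > bg_sim).astype(np.uint8)

# Toy data (hypothetical): 2-channel features on a 4x4 grid.
support_mask = np.zeros((4, 4), dtype=np.uint8)
support_mask[:, :2] = 1                                  # left half is foreground
support_feat = np.where(support_mask[..., None] == 1,
                        np.array([1.0, 0.0]), np.array([0.0, 1.0]))

query_mask = np.zeros((4, 4), dtype=np.uint8)
query_mask[:2, :] = 1                                    # top half is foreground
query_feat = np.where(query_mask[..., None] == 1,
                      np.array([0.9, 0.1]), np.array([0.2, 0.8]))

pred = segment_query(support_feat, support_mask, query_feat)
```

In this toy setting the features are already well separated, so nearest-prototype matching recovers the query mask. The challenges named in the abstract arise when learned features of different classes overlap, which is what the two branches are described as addressing.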
Pages: 1432-1447
Number of pages: 16