Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

Citations: 28
Authors
Ren, Wenqi [1]
Tang, Yang [1]
Sun, Qiyu [1]
Zhao, Chaoqiang [2]
Han, Qing-Long [3]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Smart Mfg Energy Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] Aviat Ind Corp China, Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China
[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Melbourne, Vic 3122, Australia
Keywords
Visualization; Three-dimensional displays; Semantic segmentation; Task analysis; Semantics; Annotations; Training; Computer vision; deep learning; few-shot learning; low-shot learning; semantic segmentation; zero-shot learning; VIDEO OBJECT SEGMENTATION; NETWORK; AGGREGATION; ATTENTION
DOI
10.1109/JAS.2023.123207
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block, and it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches count heavily on large-scale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories. This obstruction spurs a craze for studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled or even zero-labeled samples, which advances the extension to practical applications. Therefore, this paper focuses on the recently published few/zero-shot visual semantic segmentation methods varying from 2D to 3D space and explores the commonalities and discrepancies of technical settlements under different segmentation circumstances. Specifically, the preliminaries on few/zero-shot visual semantic segmentation, including the problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are involved to uncover the interactions of few/zero-shot learning with visual semantic segmentation, including image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.
Pages: 1106-1126
Page count: 21