Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

Cited by: 28
Authors
Ren, Wenqi [1 ]
Tang, Yang [1 ]
Sun, Qiyu [1 ]
Zhao, Chaoqiang [2 ]
Han, Qing-Long [3 ]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Smart Mfg Energy Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] Aviat Ind Corp China, Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China
[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Melbourne, Vic 3122, Australia
Keywords
Visualization; three-dimensional displays; semantic segmentation; task analysis; semantics; annotations; training; computer vision; deep learning; few-shot learning; low-shot learning; zero-shot learning; video object segmentation; network; aggregation; attention
DOI
10.1109/JAS.2023.123207
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Visual semantic segmentation aims to partition a visual sample into distinct regions with specific semantic attributes and to identify the category of each region; it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches rely heavily on large-scale training data with dense annotations and consistently fail to predict accurate semantic labels for unseen categories. This limitation has spurred intense interest in studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled, or even zero labeled, samples, which facilitates the extension to practical applications. Therefore, this paper focuses on recently published few/zero-shot visual semantic segmentation methods, ranging from 2D to 3D space, and explores the commonalities and discrepancies of technical solutions under different segmentation circumstances. Specifically, the preliminaries of few/zero-shot visual semantic segmentation, including problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are examined to reveal the interactions of few/zero-shot learning with visual semantic segmentation: image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.
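For readers unfamiliar with the episodic setting the abstract refers to (segmenting an unseen class from a few labeled support samples), the following is a minimal sketch of prototype-based one-shot segmentation, one representative technique in this area rather than the specific method of any surveyed paper; the frozen feature extractor is omitted, and the tensor shapes, function names, and similarity threshold are illustrative assumptions.

# Minimal sketch of prototype-based one-shot segmentation (illustrative only).
import torch
import torch.nn.functional as F

def masked_average_pooling(support_feat, support_mask):
    # support_feat: (C, H, W) feature map of the support image
    # support_mask: (H, W) binary mask of the novel class
    mask = support_mask.unsqueeze(0).float()                   # (1, H, W)
    masked = support_feat * mask                               # keep only foreground features
    prototype = masked.sum(dim=(1, 2)) / (mask.sum() + 1e-6)   # (C,) class prototype
    return prototype

def segment_query(query_feat, prototype, threshold=0.5):
    # Predict a foreground mask for the query by cosine similarity to the prototype.
    C, H, W = query_feat.shape
    feat = query_feat.view(C, -1)                              # (C, H*W)
    sim = F.cosine_similarity(feat, prototype.unsqueeze(1), dim=0)  # (H*W,)
    return sim.view(H, W) > threshold                          # (H, W) boolean mask

# Toy episode: random tensors stand in for the output of a frozen backbone.
support_feat = torch.randn(64, 32, 32)
support_mask = torch.rand(32, 32) > 0.7
query_feat = torch.randn(64, 32, 32)

prototype = masked_average_pooling(support_feat, support_mask)
pred_mask = segment_query(query_feat, prototype)
print(pred_mask.shape)  # torch.Size([32, 32])

In practice, surveyed methods replace the random tensors with features from a pretrained backbone and refine the similarity map with learned decoders; zero-shot variants typically substitute the visual prototype with a semantic embedding of the class name.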
Pages: 1106-1126
Page count: 21