Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

Cited by: 28
Authors
Ren, Wenqi [1 ]
Tang, Yang [1 ]
Sun, Qiyu [1 ]
Zhao, Chaoqiang [2 ]
Han, Qing-Long [3 ]
Affiliations
[1] East China Univ Sci & Technol, Key Lab Smart Mfg Energy Chem Proc, Minist Educ, Shanghai 200237, Peoples R China
[2] Aviat Ind Corp China, Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China
[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Melbourne, Vic 3122, Australia
Keywords
Visualization; three-dimensional displays; semantic segmentation; task analysis; semantics; annotations; training; computer vision; deep learning; few-shot learning; low-shot learning; zero-shot learning; video object segmentation; network; aggregation; attention
DOI
10.1109/JAS.2023.123207
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Visual semantic segmentation aims to partition a visual sample into distinct regions with specific semantic attributes and to identify the category of each region; it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches rely heavily on large-scale training data with dense annotations and consistently fail to predict accurate semantic labels for unseen categories. This limitation has spurred intense interest in studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled, or even zero labeled, samples, which facilitates the extension to practical applications. Therefore, this paper focuses on recently published few/zero-shot visual semantic segmentation methods, ranging from 2D to 3D space, and explores the commonalities and discrepancies of technical solutions under different segmentation circumstances. Specifically, the preliminaries of few/zero-shot visual semantic segmentation, including problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are examined to reveal the interactions of few/zero-shot learning with visual semantic segmentation: image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.
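For readers unfamiliar with the episodic setting the abstract refers to (segmenting an unseen class from a few labeled support samples), the following is a minimal sketch of prototype-based one-shot segmentation, one representative technique in this area rather than the specific method of any surveyed paper; the frozen feature extractor is omitted, and the tensor shapes, function names, and similarity threshold are illustrative assumptions.

# Minimal sketch of prototype-based one-shot segmentation (illustrative only).
import torch
import torch.nn.functional as F

def masked_average_pooling(support_feat, support_mask):
    # support_feat: (C, H, W) feature map of the support image
    # support_mask: (H, W) binary mask of the novel class
    mask = support_mask.unsqueeze(0).float()                   # (1, H, W)
    masked = support_feat * mask                               # keep only foreground features
    prototype = masked.sum(dim=(1, 2)) / (mask.sum() + 1e-6)   # (C,) class prototype
    return prototype

def segment_query(query_feat, prototype, threshold=0.5):
    # Predict a foreground mask for the query by cosine similarity to the prototype.
    C, H, W = query_feat.shape
    feat = query_feat.view(C, -1)                              # (C, H*W)
    sim = F.cosine_similarity(feat, prototype.unsqueeze(1), dim=0)  # (H*W,)
    return sim.view(H, W) > threshold                          # (H, W) boolean mask

# Toy episode: random tensors stand in for the output of a frozen backbone.
support_feat = torch.randn(64, 32, 32)
support_mask = torch.rand(32, 32) > 0.7
query_feat = torch.randn(64, 32, 32)

prototype = masked_average_pooling(support_feat, support_mask)
pred_mask = segment_query(query_feat, prototype)
print(pred_mask.shape)  # torch.Size([32, 32])

In practice, surveyed methods replace the random tensors with features from a pretrained backbone and refine the similarity map with learned decoders; zero-shot variants typically substitute the visual prototype with a semantic embedding of the class name.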
Pages: 1106-1126
Page count: 21