Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An Overview

Cited by: 28

Authors:

Ren, Wenqi [1]

Tang, Yang [1]

Sun, Qiyu [1]

Zhao, Chaoqiang [2]

Han, Qing-Long [3]

Affiliations:

[1] East China Univ Sci & Technol, Key Lab Smart Mfg Energy Chem Proc, Minist Educ, Shanghai 200237, Peoples R China

[2] Aviat Ind Corp China, Natl Key Lab Air Based Informat Percept & Fus, Luoyang 471000, Peoples R China

[3] Swinburne Univ Technol, Sch Sci Comp & Engn Technol, Melbourne, Vic 3122, Australia

Source:

IEEE-CAA JOURNAL OF AUTOMATICA SINICA | 2024 / Vol. 11 / Issue 05

Keywords:

Visualization; Three-dimensional displays; Semantic segmentation; Task analysis; Semantics; Annotations; Training; Computer vision; deep learning; few-shot learning; low-shot learning; semantic segmentation; zero-shot learning; VIDEO OBJECT SEGMENTATION; NETWORK; AGGREGATION; ATTENTION;

DOI:

10.1109/JAS.2023.123207

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Visual semantic segmentation aims at separating a visual sample into diverse blocks with specific semantic attributes and identifying the category for each block, and it plays a crucial role in environmental perception. Conventional learning-based visual semantic segmentation approaches count heavily on large-scale training data with dense annotations and consistently fail to estimate accurate semantic labels for unseen categories. This obstruction spurs a craze for studying visual semantic segmentation with the assistance of few/zero-shot learning. The emergence and rapid progress of few/zero-shot visual semantic segmentation make it possible to learn unseen categories from a few labeled or even zero-labeled samples, which advances the extension to practical applications. Therefore, this paper focuses on the recently published few/zero-shot visual semantic segmentation methods varying from 2D to 3D space and explores the commonalities and discrepancies of technical settlements under different segmentation circumstances. Specifically, the preliminaries on few/zero-shot visual semantic segmentation, including the problem definitions, typical datasets, and technical remedies, are briefly reviewed and discussed. Moreover, three typical instantiations are involved to uncover the interactions of few/zero-shot learning with visual semantic segmentation, including image semantic segmentation, video object segmentation, and 3D segmentation. Finally, the future challenges of few/zero-shot visual semantic segmentation are discussed.

Citation

Pages: 1106-1126

Page count: 21

163 references in total
