Semantic combined network for zero-shot scene parsing

Cited by: 2
Authors
Wang, Yinduo [1 ]
Zhang, Haofeng [1 ]
Wang, Shidong [2 ]
Long, Yang [3 ]
Yang, Longzhi [4 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[2] Univ East Anglia, Sch Comp Sci, Norwich, Norfolk, England
[3] Univ Durham, Dept Comp Sci, Durham, England
[4] Northumbria Univ, Dept Comp & Informat Sci, Newcastle Upon Tyne, Tyne & Wear, England
Funding
UK Medical Research Council; National Natural Science Foundation of China;
Keywords
object recognition; unsupervised learning; learning (artificial intelligence); natural language processing; object detection; zero-shot scene parsing; image-based scene parsing; training set; discrete labels; meaningless labels; target domains; semantic combined network; SCN; scene parsing model; semantic embeddings; traditional fully supervised scene parsing methods; generalised ZSSP settings; state-of-the-art scenes; traditional fully supervised setting; original network models;
DOI
10.1049/iet-ipr.2019.0870
CLC number
TP18 [Theory of Artificial Intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, image-based scene parsing has attracted increasing attention due to its wide application. However, conventional models are only valid on images from the same domain as the training set and are typically trained using discrete, meaningless labels. Inspired by traditional zero-shot learning methods, which employ auxiliary side information to bridge the source and target domains, the authors propose a novel framework called semantic combined network (SCN), which aims at learning a scene parsing model only from images of the seen classes while targeting the unseen ones. In addition, with the assistance of semantic embeddings of classes, the proposed SCN can further improve the performance of traditional fully supervised scene parsing methods. Extensive experiments are conducted on the Cityscapes data set, and the results show that the proposed SCN performs well under both zero-shot scene parsing (ZSSP) and generalised ZSSP settings based on several state-of-the-art scene parsing architectures. Furthermore, the authors test the proposed model under the traditional fully supervised setting, and the results show that the proposed SCN can also significantly improve the performance of the original network models.
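The record does not include the paper's implementation details, so the sketch below only illustrates the general idea the abstract describes: per-pixel features from a segmentation backbone are projected into a class-embedding space (e.g. word vectors), and each pixel is scored against class embeddings, so classes unseen during training can be recognised simply by supplying their embeddings at test time. This is a minimal assumption-based sketch, not the authors' SCN code; the class name, layer sizes, and embedding dimension are illustrative.

```python
# Minimal zero-shot scene parsing sketch (an assumption, not the authors' SCN):
# a toy fully convolutional backbone yields per-pixel features, a 1x1 convolution
# projects them into a semantic embedding space, and pixels are classified by
# cosine similarity to class embeddings, so unseen classes can be scored by
# swapping in their embeddings without retraining.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ZeroShotSceneParserSketch(nn.Module):
    def __init__(self, embed_dim: int = 300):
        super().__init__()
        # Toy backbone standing in for a real scene parsing network.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Projection from visual features to the semantic embedding space.
        self.project = nn.Conv2d(64, embed_dim, kernel_size=1)

    def forward(self, images: torch.Tensor, class_embeddings: torch.Tensor) -> torch.Tensor:
        """images: (B, 3, H, W); class_embeddings: (C, D) -> per-pixel logits (B, C, H, W)."""
        feats = self.project(self.backbone(images))    # (B, D, H, W) pixel embeddings
        feats = F.normalize(feats, dim=1)              # unit-normalise along embedding dim
        protos = F.normalize(class_embeddings, dim=1)  # unit-normalise class embeddings
        # Cosine similarity between every pixel embedding and every class embedding.
        return torch.einsum("bdhw,cd->bchw", feats, protos)


if __name__ == "__main__":
    # Train with "seen" class embeddings, then score "unseen" classes by swapping embeddings.
    model = ZeroShotSceneParserSketch(embed_dim=300)
    seen_embeddings = torch.randn(12, 300)    # e.g. word vectors of seen classes
    unseen_embeddings = torch.randn(7, 300)   # embeddings of classes never used in training
    images = torch.randn(2, 3, 64, 128)
    seen_logits = model(images, seen_embeddings)      # (2, 12, 64, 128)
    unseen_logits = model(images, unseen_embeddings)  # (2, 7, 64, 128), no retraining needed
    print(seen_logits.shape, unseen_logits.shape)
```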
Pages: 757-765
Page count: 9