HSNet: hierarchical semantics network for scene parsing

Cited by: 0
Authors
Xin Tan
Jiachen Xu
Ying Cao
Ke Xu
Lizhuang Ma
Rynson W. H. Lau
Affiliations
[1] Shanghai Jiao Tong University, Department of Computer Science and Engineering
[2] City University of Hong Kong, Department of Computer Science
Source
The Visual Computer | 2023 / Vol. 39
Keywords
Hierarchical semantics; Scene parsing; Cross-level feature; Bidirectional network
DOI
Not available
Abstract
Scene parsing is one of the fundamental tasks in computer vision. Humans tend to perceive a scene hierarchically, i.e., first identifying the coarse category (e.g., vehicle) of a group of objects and then the fine category (e.g., bicycle, truck or car) of each of them. Despite tremendous recent progress in scene parsing, such a hierarchical semantics prior (HSP) has not been explicitly exploited. In this paper, we introduce the HSP into scene parsing by proposing a hierarchical semantics network (HSNet). Our key contribution is a bidirectional cross-level feature matching framework, which enables us to learn multi-level, hierarchy-aware features via forward feature transfer and backward feature regularization. In the forward stage, we train a coarse-to-fine module to learn fine-category features that explicitly encode hierarchical semantics information. In the backward stage, we introduce a fine-to-coarse module that collapses fine-category features into coarse-category features, which are then used to regularize the feature learning of our network. Experimental results on Cityscapes and Pascal Context show that our method achieves state-of-the-art performance. Our visualization also shows that the learned features capture the semantic hierarchy well.
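The fine-to-coarse idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: HSNet collapses learned features, whereas the sketch below simply sums per-pixel class probabilities within each coarse category. The hierarchy, the class names, the function name `collapse_to_coarse`, and the probability values are all assumptions made for illustration.

```python
# Hypothetical two-level hierarchy: fine category -> coarse category.
# The grouping loosely follows the abstract's "vehicle" example; it is
# not taken from the paper's actual label hierarchy.
FINE_TO_COARSE = {
    "bicycle": "vehicle",
    "truck": "vehicle",
    "car": "vehicle",
    "person": "human",
    "rider": "human",
}


def collapse_to_coarse(fine_probs):
    """Sum fine-category probabilities within each coarse category."""
    coarse_probs = {}
    for fine_name, prob in fine_probs.items():
        coarse_name = FINE_TO_COARSE[fine_name]
        coarse_probs[coarse_name] = coarse_probs.get(coarse_name, 0.0) + prob
    return coarse_probs


# Made-up fine-category probabilities for a single pixel.
pixel = {"bicycle": 0.1, "truck": 0.2, "car": 0.5, "person": 0.15, "rider": 0.05}
coarse = collapse_to_coarse(pixel)
print(coarse)
```

In a setup like this, the collapsed coarse prediction could be compared against coarse ground-truth labels to provide an extra training signal, which is one plausible reading of the "backward feature regularization" the abstract describes.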
Pages: 2543-2554
Page count: 11