Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Cited by: 0

Authors:

Zhang, Bowen [1]

Liu, Yifan [1]

Tian, Zhi [1]

Shen, Chunhua [1]

Affiliations:

[1] Univ Adelaide, Adelaide, SA, Australia

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semantic segmentation requires per-pixel prediction for a given image. Typically, the output resolution of a segmentation network is severely reduced by the downsampling operations in the CNN backbone. Most previous methods employ upsampling decoders to recover the spatial resolution, and various decoders have been designed in the literature. Here, we propose a novel decoder, termed dynamic neural representational decoder (NRD), which is simple yet significantly more efficient. As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work we represent these local patches of labels with compact neural networks. This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient. Furthermore, these neural representations are dynamically generated and conditioned on the outputs of the encoder networks. The desired semantic labels can be efficiently decoded from the neural representations, resulting in high-resolution semantic segmentation predictions. We empirically show that our proposed decoder outperforms the decoder in DeepLabV3+ with only ~30% of the computational complexity, and achieves competitive performance with methods using dilated encoders with only ~15% of the computational cost. Experiments on the Cityscapes, ADE20K, and PASCAL Context datasets demonstrate the effectiveness and efficiency of our proposed method.
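The idea in the abstract, that each encoder location dynamically generates a compact network representing its local patch of labels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the compact network, so the sketch assumes a two-layer coordinate MLP whose parameters come from a single linear map. The names `decode_patches`, `n_mlp_params`, `w_gen`, `patch`, and `hidden` are hypothetical.

```python
import numpy as np


def n_mlp_params(hidden, n_classes):
    """Parameter count of an assumed tiny MLP: 2 coords -> hidden -> n_classes."""
    return 2 * hidden + hidden + hidden * n_classes + n_classes


def decode_patches(features, w_gen, patch=8, hidden=4, n_classes=3):
    """Decode an (H, W, C) encoder output into (H*patch, W*patch, n_classes) logits.

    Assumption: every encoder location generates the weights of its own tiny
    coordinate MLP, which is then evaluated on a patch x patch grid.
    """
    H, W, C = features.shape
    assert w_gen.shape == (C, n_mlp_params(hidden, n_classes))

    # Dynamically generate one parameter vector per encoder location.
    theta = features @ w_gen  # (H, W, n_params)

    # Slice the flat parameter vector into the MLP's weights and biases.
    i = 0
    w1 = theta[..., i:i + 2 * hidden].reshape(H, W, 2, hidden)
    i += 2 * hidden
    b1 = theta[..., i:i + hidden]
    i += hidden
    w2 = theta[..., i:i + hidden * n_classes].reshape(H, W, hidden, n_classes)
    i += hidden * n_classes
    b2 = theta[..., i:i + n_classes]

    # Relative (y, x) coordinates inside a patch, normalised to [-1, 1].
    lin = np.linspace(-1.0, 1.0, patch)
    yy, xx = np.meshgrid(lin, lin, indexing="ij")
    coords = np.stack([yy, xx], axis=-1).reshape(-1, 2)  # (patch*patch, 2)

    # Evaluate each location's MLP at every coordinate of its patch.
    h = np.einsum("pk,hwkd->hwpd", coords, w1) + b1[:, :, None, :]
    h = np.maximum(h, 0.0)  # ReLU
    out = np.einsum("hwpd,hwdc->hwpc", h, w2) + b2[:, :, None, :]

    # Tile the per-location patches into one high-resolution logit map.
    out = out.reshape(H, W, patch, patch, n_classes)
    return out.transpose(0, 2, 1, 3, 4).reshape(H * patch, W * patch, n_classes)


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5, 16))            # stand-in encoder output
w_gen = rng.normal(size=(16, n_mlp_params(4, 3)))  # stand-in weight generator
logits = decode_patches(features, w_gen)
labels = logits.argmax(axis=-1)                    # per-pixel class prediction
print(logits.shape, labels.shape)
```

In this sketch the decoding cost per output pixel is that of a very small MLP, and each patch varies smoothly with the in-patch coordinates, which loosely mirrors the smoothness prior mentioned in the abstract.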


Pages: 12
