CGNet: A Light-Weight Context Guided Network for Semantic Segmentation

Cited by: 471
Authors
Wu, Tianyi [1 ,2 ]
Tang, Sheng [1 ,2 ]
Zhang, Rui [1 ,2 ]
Cao, Juan [1 ,2 ]
Zhang, Yongdong [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Semantics; Image segmentation; Context modeling; Computer architecture; Computational modeling; Mobile handsets; Predictive models; Semantic segmentation; surrounding context; global context; context guided;
DOI
10.1109/TIP.2020.3042065
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The demand for applying semantic segmentation models on mobile devices has been increasing rapidly. Current state-of-the-art networks have an enormous number of parameters and are therefore unsuitable for mobile devices, while other small-memory-footprint models follow the spirit of classification networks and ignore the inherent characteristics of semantic segmentation. To tackle this problem, we propose the novel Context Guided Network (CGNet), a light-weight and efficient network for semantic segmentation. We first propose the Context Guided (CG) block, which learns the joint feature of both the local feature and the surrounding context effectively and efficiently, and further improves the joint feature with the global context. Based on the CG block, we develop CGNet, which captures contextual information in all stages of the network. CGNet is specially tailored to exploit the inherent properties of semantic segmentation and to increase segmentation accuracy. Moreover, CGNet is elaborately designed to reduce the number of parameters and save memory. Under an equivalent number of parameters, the proposed CGNet significantly outperforms existing light-weight segmentation networks. Extensive experiments on the Cityscapes and CamVid datasets verify the effectiveness of the proposed approach. Specifically, without any post-processing or multi-scale testing, the proposed CGNet achieves 64.8% mean IoU on Cityscapes with fewer than 0.5 M parameters.
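The abstract describes the CG block as combining a local feature, a surrounding (dilated) context feature, and a global-context refinement. A minimal sketch of that data flow is below; the fixed averaging kernels, the dilation rate, and the sigmoid gate are illustrative stand-ins for the paper's learned convolutions and fully connected gating layer, not the authors' implementation:

```python
import numpy as np

def conv3x3(x, dilation=1):
    """Naive per-channel 3x3 convolution with fixed averaging weights
    (a stand-in for a learned conv; weights here are illustrative)."""
    c, h, w = x.shape
    pad = dilation
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += xp[:, pad + dy * dilation: pad + dy * dilation + h,
                         pad + dx * dilation: pad + dx * dilation + w]
    return out / 9.0

def cg_block(x, dilation=2):
    """Sketch of the CG block's data flow: a local feature and a
    surrounding (dilated) context feature form the joint feature,
    which a global-context gate then reweights channel-wise."""
    f_loc = conv3x3(x, dilation=1)                  # local feature
    f_sur = conv3x3(x, dilation=dilation)           # surrounding context
    f_joi = np.concatenate([f_loc, f_sur], axis=0)  # joint feature
    # global context: global average pooling per channel -> sigmoid gate
    g = f_joi.mean(axis=(1, 2), keepdims=True)
    gate = 1.0 / (1.0 + np.exp(-g))
    return f_joi * gate

x = np.random.rand(8, 16, 16)   # (channels, height, width)
y = cg_block(x)
print(y.shape)                  # (16, 16, 16): channels double after concat
```

In the paper, stacking such blocks in every stage (rather than only at the end, as classification-derived networks do) is what lets the network stay small while still modeling context.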
Pages: 1169-1179
Page count: 11