Knowledge Adaptation for Efficient Semantic Segmentation

Cited by: 170

Authors:

He, Tong [1]

Shen, Chunhua [1]

Tian, Zhi [1]

Gong, Dong [1]

Sun, Changming [2]

Yan, Youliang [3]

Affiliations:

[1] Univ Adelaide, Adelaide, SA, Australia

[2] CSIRO, Data61, Canberra, ACT, Australia

[3] Huawei Technol, Noahs Ark Lab, Hong Kong, Peoples R China

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Keywords:

DOI:

10.1109/CVPR.2019.00067

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Both accuracy and efficiency are of significant importance to the task of semantic segmentation. Existing deep FCNs suffer from heavy computation because they maintain a series of high-resolution feature maps to preserve the detailed knowledge needed for dense estimation. Although reducing the feature map resolution (i.e., applying a large overall stride) via subsampling operations (e.g., pooling and convolution striding) can immediately increase efficiency, it dramatically decreases estimation accuracy. To tackle this dilemma, we propose a knowledge distillation method tailored for semantic segmentation to improve the performance of compact FCNs with a large overall stride. To handle the inconsistency between the features of the student and teacher networks, we optimize the feature similarity in a transferred latent domain formulated by utilizing a pre-trained autoencoder. Moreover, an affinity distillation module is proposed to capture long-range dependencies by calculating the non-local interactions across the whole image. To validate the effectiveness of our proposed method, extensive experiments have been conducted on three popular benchmarks: Pascal VOC, Cityscapes and Pascal Context. Built upon a highly competitive baseline, our proposed method can improve the performance of a student network by 2.5 percentage points (mIoU rises from 70.2 to 72.7 on the Cityscapes test set) and can train a better compact model with only 8% of the floating-point operations (FLOPs) of a model that achieves comparable performance.
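The affinity idea mentioned in the abstract (non-local interactions across the whole image) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which similarity measure or loss is used, so cosine similarity and a mean squared difference are assumptions here, and the names `affinity_matrix` and `affinity_loss` are hypothetical. The sketch also assumes the student and teacher feature maps have already been brought to the same number of spatial positions.

```python
import math

def affinity_matrix(features):
    """Pairwise cosine similarity between all spatial positions.

    `features` is a list of N feature vectors, one per spatial location
    (an H x W map flattened to N = H * W). Returns an N x N matrix.
    Cosine similarity is an assumption, not taken from the abstract.
    """
    normed = []
    for vec in features:
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against zero vectors
        normed.append([v / norm for v in vec])
    return [[sum(a * b for a, b in zip(p, q)) for q in normed] for p in normed]

def affinity_loss(student_features, teacher_features):
    """Mean squared difference between the two affinity matrices (assumed loss)."""
    a_s = affinity_matrix(student_features)
    a_t = affinity_matrix(teacher_features)
    n = len(a_s)
    return sum((a_s[i][j] - a_t[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

# Toy example: 4 spatial positions; the teacher has 3 channels, the student 2.
teacher = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
student = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
loss = affinity_loss(student, teacher)
```

Because each affinity matrix is N x N regardless of the channel count, the two networks can be compared even when their feature widths differ, which is one reason a pairwise formulation is convenient in this sketch.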

Pages: 578-587

Page count: 10
