Knowledge Adaptation for Efficient Semantic Segmentation

Cited by: 147
Authors
He, Tong [1]
Shen, Chunhua [1]
Tian, Zhi [1]
Gong, Dong [1]
Sun, Changming [2]
Yan, Youliang [3]
Affiliations
[1] Univ Adelaide, Adelaide, SA, Australia
[2] CSIRO, Data61, Canberra, ACT, Australia
[3] Huawei Technol, Noah's Ark Lab, Hong Kong, People's Republic of China
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00067
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Both accuracy and efficiency are of significant importance to the task of semantic segmentation. Existing deep FCNs suffer from heavy computations due to a series of high-resolution feature maps for preserving the detailed knowledge in dense estimation. Although reducing the feature map resolution (i.e., applying a large overall stride) via subsampling operations (e.g., pooling and convolution striding) can instantly increase the efficiency, it dramatically decreases the estimation accuracy. To tackle this dilemma, we propose a knowledge distillation method tailored for semantic segmentation to improve the performance of compact FCNs with a large overall stride. To handle the inconsistency between the features of the student and teacher networks, we optimize the feature similarity in a transferred latent domain formulated by utilizing a pre-trained autoencoder. Moreover, an affinity distillation module is proposed to capture the long-range dependency by calculating the non-local interactions across the whole image. To validate the effectiveness of our proposed method, extensive experiments have been conducted on three popular benchmarks: Pascal VOC, Cityscapes and Pascal Context. Built upon a highly competitive baseline, our proposed method can improve the performance of a student network by 2.5% (mIoU rises from 70.2 to 72.7 on the Cityscapes test set) and can train a better compact model with only 8% of the floating-point operations (FLOPs) of a model that achieves comparable performance.
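The affinity distillation idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formula, so the cosine-similarity affinity, the mean-squared-error loss, and the function names `affinity_matrix` and `affinity_distillation_loss` are all assumptions made for illustration. The sketch only shows how pairwise (non-local) interactions between every pair of spatial positions can be compared between a student and a teacher feature map.

```python
import numpy as np


def affinity_matrix(features):
    """Pairwise cosine similarity between all spatial positions.

    features: array of shape (C, H, W).
    Returns an (H*W, H*W) matrix. Cosine similarity is an assumed
    choice of affinity; the abstract does not specify one.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T              # (N, C), one row per position
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-12)           # avoid division by zero
    return flat @ flat.T                             # (N, N)


def affinity_distillation_loss(student_feat, teacher_feat):
    """Mean squared difference between student and teacher affinities.

    The two feature maps must share spatial size (H, W); their channel
    counts may differ because the affinity matrix is N x N either way.
    The squared-error form is an assumption for this sketch.
    """
    a_s = affinity_matrix(student_feat)
    a_t = affinity_matrix(teacher_feat)
    return float(np.mean((a_s - a_t) ** 2))


# Hypothetical feature maps, random values for demonstration only.
rng = np.random.default_rng(0)
teacher = rng.standard_normal((8, 4, 4))
student = rng.standard_normal((6, 4, 4))
loss = affinity_distillation_loss(student, teacher)
```

Because the affinity matrix depends only on the number of spatial positions, the student and teacher can have different channel widths without an extra projection layer, which is one plausible reason to compare affinities rather than raw features.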
Pages: 578-587
Page count: 10