Knowledge Adaptation for Efficient Semantic Segmentation

Cited: 147
Authors
He, Tong [1 ]
Shen, Chunhua [1 ]
Tian, Zhi [1 ]
Gong, Dong [1 ]
Sun, Changming [2 ]
Yan, Youliang [3 ]
Affiliations
[1] Univ Adelaide, Adelaide, SA, Australia
[2] CSIRO, Data61, Canberra, ACT, Australia
[3] Huawei Technol, Noahs Ark Lab, Hong Kong, Peoples R China
Source
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019
DOI
10.1109/CVPR.2019.00067
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Both accuracy and efficiency are of significant importance to the task of semantic segmentation. Existing deep FCNs suffer from heavy computations due to a series of high-resolution feature maps used to preserve detailed knowledge in dense estimation. Although reducing the feature-map resolution (i.e., applying a large overall stride) via subsampling operations (e.g., pooling and convolution striding) can instantly increase efficiency, it dramatically decreases estimation accuracy. To tackle this dilemma, we propose a knowledge distillation method tailored for semantic segmentation to improve the performance of compact FCNs with a large overall stride. To handle the inconsistency between the features of the student and teacher networks, we optimize feature similarity in a transferred latent domain formulated by utilizing a pre-trained autoencoder. Moreover, an affinity distillation module is proposed to capture long-range dependencies by calculating non-local interactions across the whole image. To validate the effectiveness of the proposed method, extensive experiments have been conducted on three popular benchmarks: Pascal VOC, Cityscapes, and Pascal Context. Built upon a highly competitive baseline, our method improves the performance of a student network by 2.5% (mIoU boosts from 70.2 to 72.7 on the Cityscapes test set) and can train a better compact model with only 8% of the floating-point operations (FLOPs) of a model that achieves comparable performance.
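As a rough illustration of the affinity distillation idea described in the abstract, the sketch below computes pairwise (non-local) affinities across all spatial locations of a feature map and penalizes the squared difference between the student's and teacher's affinity matrices. This is not the authors' implementation: the function names, the cosine-similarity normalization, and the mean-squared-error penalty are assumptions made for this sketch.

```python
import numpy as np

def affinity_matrix(feat):
    """Pairwise affinity across all spatial locations of a (C, H, W) feature map.

    Each location's C-dimensional feature vector is L2-normalized, so entries
    of the resulting (HW, HW) matrix are cosine similarities in [-1, 1].
    """
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)                               # (C, HW)
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + 1e-8)
    return f.T @ f                                           # (HW, HW)

def affinity_distillation_loss(student_feat, teacher_feat):
    """Mean squared difference between student and teacher affinity matrices.

    Because the loss compares location-to-location similarity structure rather
    than raw features, the two maps only need matching spatial size, not
    matching channel counts.
    """
    a_s = affinity_matrix(student_feat)
    a_t = affinity_matrix(teacher_feat)
    return float(np.mean((a_s - a_t) ** 2))
```

Comparing affinity matrices rather than raw features is what lets a thin student mimic the long-range dependency structure of a wide teacher even when their channel dimensions differ.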
Pages: 578 - 587
Page count: 10