Multi-view knowledge distillation for efficient semantic segmentation

Cited by: 4
Authors
Wang, Chen [1]
Zhong, Jiang [1]
Dai, Qizhu [1]
Qi, Yafei [2]
Shi, Fengyuan [3]
Fang, Bin [1]
Li, Xue [4]
Affiliations
[1] Chongqing Univ, Sch Comp Sci, Chongqing 400044, Peoples R China
[2] Cent South Univ, Sch Comp Sci & Engn, Changsha 410083, Peoples R China
[3] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110819, Peoples R China
[4] Univ Queensland, Sch Informat Technol & Elect Engn, Brisbane, Qld 4072, Australia
Funding
National Natural Science Foundation of China
Keywords
Multi-view learning; Knowledge distillation; Knowledge aggregation; Semantic segmentation; Ensemble
DOI
10.1007/s11554-023-01296-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Current state-of-the-art semantic segmentation models achieve remarkable segmentation accuracy. However, their large model size and computing cost restrict their use in low-latency online systems and on devices. Knowledge distillation is a popular solution for compressing large-scale segmentation models: it trains a small student segmentation model from a large teacher model. However, a single teacher model's knowledge may be insufficiently diverse to train an accurate student model, and the student model may inherit bias from the teacher model. This paper proposes a multi-view knowledge distillation framework, called MVKD, for efficient semantic segmentation. MVKD aggregates the multi-view knowledge from multiple teacher models and transfers it to the student model. In MVKD, we introduce a multi-view co-tuning strategy to acquire uniformity among the multi-view knowledge in features from different teachers. In addition, we propose a multi-view feature distillation loss and a multi-view output distillation loss to transfer the multi-view knowledge in the features and outputs of multiple teachers to the student. We evaluate the proposed MVKD on three benchmark datasets: Cityscapes, CamVid, and Pascal VOC 2012. Experimental results demonstrate the effectiveness of the proposed MVKD in compressing semantic segmentation models.
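The abstract does not give the form of the multi-view output distillation loss, so the following is only a rough sketch of the general idea of distilling from several teachers' outputs, not MVKD's actual loss. It assumes, purely for illustration, that the teachers' temperature-softened per-pixel class distributions are averaged uniformly and that the student is penalized with a KL divergence against that average. The function names, the uniform averaging, and the temperature value are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one pixel's class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_output_kd(student_logits, teacher_logits_list, temperature=2.0):
    """Mean per-pixel KL(average teacher distribution || student distribution).

    student_logits: list of pixels, each a list of class logits.
    teacher_logits_list: one such list per teacher, same shape as the student's.
    Uniform averaging over teachers is an assumption of this sketch.
    """
    num_teachers = len(teacher_logits_list)
    loss = 0.0
    for p, pixel_logits in enumerate(student_logits):
        student_probs = softmax(pixel_logits, temperature)
        teacher_probs = [softmax(t[p], temperature) for t in teacher_logits_list]
        # Aggregate the teachers' "views" of this pixel into one soft target.
        target = [
            sum(tp[c] for tp in teacher_probs) / num_teachers
            for c in range(len(student_probs))
        ]
        loss += sum(
            q * math.log(q / s) for q, s in zip(target, student_probs) if q > 0
        )
    return loss / len(student_logits)
```

Under these assumptions the loss is zero when the student reproduces the averaged teacher distribution at every pixel and positive otherwise. A real segmentation pipeline would compute this on batched tensors in a deep learning framework rather than on nested Python lists.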
Pages: 11