KNOWLEDGE DISTILLATION WITH CATEGORY-AWARE ATTENTION AND DISCRIMINANT LOGIT LOSSES

Cited by: 4
Authors
Jiang, Lei [1 ]
Zhou, Wengang [1 ]
Li, Houqiang [1 ]
Affiliations
[1] Univ Sci & Technol China, EEIS Dept, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei, Anhui, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2019
Keywords
knowledge distillation; attention transfer; model compression;
DOI
10.1109/ICME.2019.00308
Chinese Library Classification
TP31 [Computer Software];
Discipline Classification Codes
081202; 0835;
Abstract
Deep neural networks (DNNs) typically demand large amounts of storage and computation, which limits their deployment on resource-constrained platforms. Knowledge distillation addresses this limitation by transferring knowledge from a large yet accurate teacher model to a small and fast student model. In this paper, we propose two objective functions to optimize the knowledge transfer process. First, we propose a category-aware attention loss that operates at the convolutional feature level and captures object localization information. Second, we propose a discriminant logit loss at the fully-connected feature level to capture classification information. Combined, the two objective functions integrate features from different levels and guide the training of the student. We demonstrate the effectiveness of our approach on several CNN models across various datasets and show consistent performance gains with the proposed method.
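The abstract does not give the exact formulation of the two proposed losses, so the sketch below only illustrates the standard building blocks they refine: Hinton-style temperature-scaled logit distillation combined with an attention-transfer term on convolutional feature maps. The function names, the temperature T, and the weights alpha and beta are hypothetical choices for illustration, not the paper's formulation.

import torch
import torch.nn.functional as F

def attention_map(feat):
    # Spatial attention map from a conv feature map of shape (B, C, H, W):
    # mean of squared activations over channels, L2-normalized per sample.
    att = feat.pow(2).mean(dim=1).flatten(1)  # (B, H*W)
    return F.normalize(att, p=2, dim=1)

def distillation_loss(student_logits, teacher_logits,
                      student_feats, teacher_feats,
                      labels, T=4.0, alpha=0.9, beta=1e3):
    # Soft-target logit loss: temperature-scaled KL divergence (Hinton et al.).
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # Hard-label cross-entropy on the student's own predictions.
    ce = F.cross_entropy(student_logits, labels)
    # Attention-transfer loss: match normalized spatial attention maps at
    # corresponding stages (assumes matching spatial sizes for each pair).
    at = sum((attention_map(fs) - attention_map(ft)).pow(2).mean()
             for fs, ft in zip(student_feats, teacher_feats))
    return alpha * kd + (1 - alpha) * ce + beta * at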
Pages: 1792-1797
Number of pages: 6