KNOWLEDGE DISTILLATION WITH CATEGORY-AWARE ATTENTION AND DISCRIMINANT LOGIT LOSSES

Cited by: 4
Authors
Jiang, Lei [1 ]
Zhou, Wengang [1 ]
Li, Houqiang [1 ]
Affiliations
[1] Univ Sci & Technol China, EEIS Dept, CAS Key Lab Technol Geospatial Informat Proc & Ap, Hefei, Anhui, Peoples R China
Source
2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) | 2019
Keywords
knowledge distillation; attention transfer; model compression;
DOI
10.1109/ICME.2019.00308
Chinese Library Classification
TP31 [Computer Software];
Discipline Classification Codes
081202; 0835;
Abstract
Deep neural networks (DNNs) typically demand large amounts of storage and computation, which limits their deployment on resource-constrained platforms. Knowledge distillation addresses this limitation by transferring knowledge from a large yet accurate teacher model to a small and fast student model. In this paper, we propose two objective functions to optimize the knowledge transfer process. First, we propose a category-aware attention loss that operates at the convolutional feature level and captures object localization information. Second, we propose a discriminant logit loss at the fully-connected feature level to capture classification information. Combined, the two objective functions integrate features from different levels and guide the training of the student. We demonstrate the effectiveness of our approach on several CNN models across various datasets and show consistent performance gains with the proposed method.
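The abstract does not give the exact formulation of the two proposed losses, so the sketch below only illustrates the standard building blocks they refine: Hinton-style temperature-scaled logit distillation combined with an attention-transfer term on convolutional feature maps. The function names, the temperature T, and the weights alpha and beta are hypothetical choices for illustration, not the paper's formulation.

import torch
import torch.nn.functional as F

def attention_map(feat):
    # Spatial attention map from a conv feature map of shape (B, C, H, W):
    # mean of squared activations over channels, L2-normalized per sample.
    att = feat.pow(2).mean(dim=1).flatten(1)  # (B, H*W)
    return F.normalize(att, p=2, dim=1)

def distillation_loss(student_logits, teacher_logits,
                      student_feats, teacher_feats,
                      labels, T=4.0, alpha=0.9, beta=1e3):
    # Soft-target logit loss: temperature-scaled KL divergence (Hinton et al.).
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    # Hard-label cross-entropy on the student's own predictions.
    ce = F.cross_entropy(student_logits, labels)
    # Attention-transfer loss: match normalized spatial attention maps at
    # corresponding stages (assumes matching spatial sizes for each pair).
    at = sum((attention_map(fs) - attention_map(ft)).pow(2).mean()
             for fs, ft in zip(student_feats, teacher_feats))
    return alpha * kd + (1 - alpha) * ce + beta * at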
Pages: 1792-1797
Number of pages: 6