Knowledge Amalgamation for Object Detection With Transformers

Cited by: 8

Authors:

Zhang, Haofei [1]

Mao, Feng [2]

Xue, Mengqi [3]

Fang, Gongfan [4]

Feng, Zunlei [5]

Song, Jie [5]

Song, Mingli [6,7,8,9]

Affiliations:

[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Peoples R China

[2] Alibaba Grp, Xixi Campus, Hangzhou 311121, Peoples R China

[3] Hangzhou City Univ, Sch Comp & Comp Sci, Hangzhou 310028, Peoples R China

[4] Natl Univ Singapore, Elect & Comp Engn, Singapore 119077, Singapore

[5] Zhejiang Univ, Coll Software Technol, Hangzhou 310027, Peoples R China

[6] Zhejiang Univ, Shanghai Inst Adv Study, Shanghai 200080, Peoples R China

[7] Zhejiang Univ, Coll Comp Sci, Hangzhou 310027, Peoples R China

[8] Zhejiang Univ, Zhejiang Prov Key Lab Serv Robot, Hangzhou 310027, Peoples R China

[9] ZJU Bangsun Joint Res Ctr, Hangzhou 310058, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Transformers; Task analysis; Object detection; Detectors; Training; Computer architecture; Feature extraction; Model reusing; knowledge amalgamation; knowledge distillation; object detection; vision transformers;

DOI:

10.1109/TIP.2023.3263105

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Knowledge amalgamation (KA) is a novel deep model reusing task aiming to transfer knowledge from several well-trained teachers to a multi-talented and compact student. Currently, most of these approaches are tailored for convolutional neural networks (CNNs). However, there is a tendency that Transformers, with a completely different architecture, are starting to challenge the domination of CNNs in many computer vision tasks. Nevertheless, directly applying the previous KA methods to Transformers leads to severe performance degradation. In this work, we explore a more effective KA scheme for Transformer-based object detection models. Specifically, considering the architecture characteristics of Transformers, we propose to dissolve the KA into two aspects: sequence-level amalgamation (SA) and task-level amalgamation (TA). In particular, a hint is generated within the sequence-level amalgamation by concatenating teacher sequences instead of redundantly aggregating them to a fixed-size one as previous KA approaches. Besides, the student learns heterogeneous detection tasks through soft targets with efficiency in the task-level amalgamation. Extensive experiments on PASCAL VOC and COCO have unfolded that the sequence-level amalgamation significantly boosts the performance of students, while the previous methods impair the students. Moreover, the Transformer-based students excel in learning amalgamated knowledge, as they have mastered heterogeneous detection tasks rapidly and achieved superior or at least comparable performance to those of the teachers in their specializations.

Pages: 2093-2106

Page count: 14
