Adaptive lightweight network construction method for Self-Knowledge Distillation
Cited by: 0
Authors: Lu, Siyuan [1]; Zeng, Weiliang [1]; Li, Xueshi [1]; Ou, Jiajun [1]
Affiliations:
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China
Source:
Keywords:
Deep learning;
Knowledge Distillation;
Neural network architecture design;
DOI: 10.1016/j.neucom.2025.129477
Chinese Library Classification: TP18 [Theory of Artificial Intelligence];
Subject Classification Codes: 081104; 0812; 0835; 1405
Abstract:
Self-Knowledge Distillation (self-KD) has become a promising method for neural network compression owing to its advantages in computational efficiency. Nevertheless, its applicability is constrained by the inherent inflexibility of the network architecture and the absence of quantitative metrics to evaluate the distillability of the architecture. To address these problems, a two-stage adaptive dynamic distillation network framework (ADDN) is proposed to adapt the architecture based on its current distillability, comprising a hypernetwork topology construction stage and a subnetwork training stage. To evaluate the distillability of candidate architectures without extensive training, we propose a set of low-cost distillability metrics that assess architectures from the perspectives of architectural similarity and clustering ability. Furthermore, to simplify the hypernetwork structure and reduce the complexity of the construction process, a hierarchical filtering module is introduced to dynamically and incrementally refine and remove candidate operations within the architecture, contingent upon the distillability of the current architecture. To validate the effectiveness of our approach, we conduct extensive experiments on various image classification datasets and compare our method with current works. Experimental results demonstrate that the self-knowledge distillation network architecture obtained by the proposed methodology attains superior distillability and efficiency simultaneously while significantly curtailing construction costs.
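To make the self-KD objective that the abstract builds on concrete, below is a minimal sketch of a generic self-knowledge distillation loss in PyTorch, in which an auxiliary (shallower) branch is trained against both the ground-truth labels and the softened predictions of the full network. This illustrates standard self-KD only, not the paper's ADDN framework or its distillability metrics; the function name, temperature, and alpha weighting are assumptions made for the example.

import torch.nn.functional as F

def self_kd_loss(student_logits, teacher_logits, targets, temperature=4.0, alpha=0.5):
    # Supervised term on the auxiliary branch's own predictions.
    ce = F.cross_entropy(student_logits, targets)
    # Distillation term: KL divergence between the branch's softened outputs and the
    # softened outputs of the full network (treated as the teacher, no gradient flow).
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Weighted combination of the supervised and distillation terms.
    return (1.0 - alpha) * ce + alpha * kd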
Pages: 14