50 records in total
- [2] Structured Attention Knowledge Distillation for Lightweight Networks. Proceedings of the 33rd Chinese Control and Decision Conference (CCDC 2021), 2021: 1726-1730.
- [3] A-A KD: Attention and Activation Knowledge Distillation. 2021 IEEE Seventh International Conference on Multimedia Big Data (BigMM 2021), 2021: 57-60.
- [4] Knowledge Fusion Distillation: Improving Distillation with Multi-scale Attention Mechanisms. Neural Processing Letters, 2023, 55: 6165-6180.
- [6] What Can Attention Module Do in Knowledge Distillation? 2021 4th International Conference on Robotics, Control and Automation Engineering (RCAE 2021), 2021: 196-200.
- [8] Dynamic Refining Knowledge Distillation Based on Attention Mechanism. PRICAI 2022: Trends in Artificial Intelligence, Pt. II, 2022, 13630: 45-58.
- [9] Sparse Mixture of Experts Language Models Excel in Knowledge Distillation. Natural Language Processing and Chinese Computing, Pt. III (NLPCC 2024), 2025, 15361: 80-91.