Adaptive Discrepancy Masked Distillation for remote sensing object detection

Cited: 0
Authors
Li, Cong [1 ]
Cheng, Gong [1 ]
Han, Junwei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Knowledge distillation; Remote sensing images;
DOI
10.1016/j.isprsjprs.2025.02.006
Chinese Library Classification
P9 [Physical Geography];
Discipline Classification Code
0705; 070501;
Abstract
Knowledge distillation (KD) has become a promising technique for obtaining a performant student detector in remote sensing images by inheriting knowledge from a heavy teacher detector. Unfortunately, not every pixel contributes equally to the final KD performance; some are even detrimental. To address this problem, existing methods usually derive a distillation mask to emphasize the valuable regions during KD. In this paper, we put forth Adaptive Discrepancy Masked Distillation (ADMD), a novel KD framework that explicitly localizes the beneficial pixels. Our approach stems from the observation that the feature discrepancy between the teacher and student is the essential reason for their performance gap. In this regard, we use the feature discrepancy to determine which locations cause the student to lag behind the teacher, and then regulate the student to assign higher learning priority to those locations. Furthermore, we empirically observe that discrepancy-masked distillation leads to loss vanishing in later KD stages. To combat this issue, we introduce a simple yet practical weight-increasing module, in which the magnitude of the KD loss is adaptively adjusted to ensure that KD steadily contributes to student optimization. Comprehensive experiments on DIOR and DOTA across various dense detectors show that our ADMD consistently harvests remarkable performance gains, particularly under a prolonged distillation schedule, and exhibits superiority over state-of-the-art counterparts. Code and trained checkpoints will be made available at https://github.com/swift1988.
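The two ideas in the abstract (a per-pixel mask driven by teacher-student feature discrepancy, and a weight-increasing schedule that counters loss vanishing) can be sketched as follows. This is a minimal illustrative sketch under stated assumptions, not the authors' released implementation: the channel-mean squared discrepancy, the sum-to-one mask normalization, and the linear weight ramp are all assumptions made for the example.

```python
# Hypothetical sketch of discrepancy-masked distillation with an
# increasing loss weight. Mask construction and the weight schedule
# are illustrative assumptions, not the exact ADMD formulation.
import torch
import torch.nn.functional as F

def discrepancy_masked_kd_loss(feat_s, feat_t, step, total_steps,
                               w_min=1.0, w_max=4.0):
    """Feature-imitation loss weighted by teacher-student discrepancy.

    feat_s, feat_t: (N, C, H, W) student / teacher feature maps.
    step, total_steps: current / total training iterations, used by the
        weight-increasing schedule that keeps the shrinking masked loss
        contributing in later KD stages.
    """
    # Per-pixel discrepancy: channel-mean squared error (assumption).
    discrepancy = (feat_t.detach() - feat_s).pow(2).mean(dim=1)   # (N, H, W)
    # Normalize into a soft mask so high-discrepancy pixels, where the
    # student lags behind the teacher, receive higher learning priority.
    mask = discrepancy / (discrepancy.sum(dim=(1, 2), keepdim=True) + 1e-6)
    mask = mask.detach()  # the mask guides learning but is not optimized
    # Masked distillation: weight the per-pixel imitation error by the mask.
    per_pixel = F.mse_loss(feat_s, feat_t.detach(),
                           reduction="none").mean(dim=1)          # (N, H, W)
    kd_loss = (mask * per_pixel).sum(dim=(1, 2)).mean()
    # Weight-increasing schedule: linearly ramp the loss weight so KD
    # steadily contributes to student optimization as training proceeds.
    w = w_min + (w_max - w_min) * (step / max(total_steps, 1))
    return w * kd_loss
```

Detaching the mask is a deliberate choice in this sketch: the discrepancy selects where to imitate, while gradients flow only through the student's feature imitation term itself.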
Pages: 54-63
Number of pages: 10