Scene Text Detection in Foggy Weather Utilizing Knowledge Distillation of Diffusion Models

Cited by: 0

Authors:

Liu, Zhaoxi [1]

Zhou, Gang [1]

Jia, Zhenhong [1]

Shi, Fei [1]

Affiliation:

[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi 830046, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2025 / Vol. 32

Funding:

National Natural Science Foundation of China;

Keywords:

Diffusion models; Feature extraction; Text detection; Meteorology; Training; Text to image; Convolution; Visualization; Kernel; Semantics; knowledge distillation; adverse weather; scene text detection;

DOI:

10.1109/LSP.2025.3540371

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

Adverse weather conditions can significantly hinder the performance of deep learning-based object detection models. Traditional approaches often rely on image restoration techniques to enhance the quality of degraded images prior to detection. However, these methods frequently struggle to balance image enhancement and detection tasks effectively, often overlooking latent information that could be beneficial for detection. To address these challenges, we propose a novel framework: Knowledge Distillation based on Diffusion Models (KDDM). This framework incorporates a Dehaze Network (DN), which employs large kernel convolution to remove weather-specific artifacts, thereby revealing more latent information. The DN, together with a text detector, forms an end-to-end scene text detection network, acting as the student network. Additionally, the nuanced internal representations of text-to-image diffusion models adeptly capture and integrate higher-order visual semantic concepts. Given the rich textual and visual content inherent in scene text, there is a fundamental connection to text-to-image diffusion models. As such, we utilize diffusion models as a teacher network to distill high-level visual semantic knowledge into the student network. Notably, we introduce an innovative distillation technique using a "Threshold_Mask", which ensures that the student network focuses on text regions while minimizing interference from irrelevant background elements. Comprehensive experimental evaluations demonstrate that our KDDM framework significantly outperforms baseline models under foggy weather conditions, marking a substantial advancement in the field.
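
The abstract does not say how the "Threshold_Mask" is computed or which loss the distillation uses. As a rough illustration only, the sketch below assumes the mask comes from thresholding a per-pixel text probability map and that the distillation loss is a mean squared error between teacher and student features restricted to masked pixels. The function names, the 0.5 threshold, and the MSE choice are all hypothetical and are not taken from the paper.

```python
def threshold_mask(prob_map, threshold=0.5):
    """Binarize an H x W text probability map (assumed input, not from the paper).

    Pixels with probability >= threshold are treated as text (1.0), others as background (0.0).
    """
    return [[1.0 if p >= threshold else 0.0 for p in row] for row in prob_map]


def masked_distillation_loss(student, teacher, mask):
    """Mean squared error between student and teacher features over masked pixels only.

    student, teacher: lists of channels, each an H x W list of floats.
    mask: H x W list of 0.0 / 1.0 values, shared across channels.
    Returns 0.0 when the mask selects no pixels.
    """
    total = 0.0
    count = 0.0
    for s_channel, t_channel in zip(student, teacher):
        for s_row, t_row, m_row in zip(s_channel, t_channel, mask):
            for s, t, m in zip(s_row, t_row, m_row):
                total += m * (s - t) ** 2  # background pixels contribute nothing
                count += m
    return total / count if count > 0 else 0.0


# Toy example: one channel, 2 x 2 feature map.
prob_map = [[0.9, 0.1], [0.6, 0.2]]
mask = threshold_mask(prob_map)
student_feat = [[[1.0, 5.0], [2.0, 7.0]]]
teacher_feat = [[[0.0, 0.0], [0.0, 0.0]]]
loss = masked_distillation_loss(student_feat, teacher_feat, mask)
```

In this toy case the large student-teacher differences in the right-hand column are ignored because those pixels fall below the threshold, which is the intended effect of keeping background regions out of the distillation signal.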

Pages: 996-1000

Page count: 5
