LDWLE: self-supervised driven low-light object detection framework

Cited: 0
Authors
Shen, Xiaoyang [1 ,2 ]
Li, Haibin [1 ,2 ]
Li, Yaqian [1 ,2 ]
Zhang, Wenming [1 ,2 ]
Affiliations
[1] Yanshan Univ, Coll Elect Engn, Qinhuangdao 066000, Hebei, Peoples R China
[2] Key Lab Ind Comp Control Engn Hebei Prov, Qinhuangdao 066000, Hebei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Low-light transformation; Self-supervised learning; Jointly trained; Regularization signal;
DOI
10.1007/s40747-024-01681-z
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Low-light object detection involves identifying and locating objects in images captured under poor lighting conditions. It plays a significant role in surveillance and security, nighttime pedestrian recognition, and autonomous driving, giving it broad application prospects. Most existing object detection algorithms and datasets are designed for normal lighting conditions, leading to a significant drop in detection performance in low-light environments. To address this issue, we propose a Low-Light Detection with Low-Light Enhancement (LDWLE) framework. LDWLE is an encoder-decoder architecture in which the encoder transforms the raw input into a compact, abstract representation (encoding), and the decoder gradually generates the target output from that representation. Specifically, during training, low-light images are input into the encoder, which produces feature representations that are decoded by two separate decoders: an object detection decoder and a low-light image enhancement decoder. Both decoders share the same encoder and are trained jointly. Throughout training, the two decoders optimize each other, guiding the low-light image enhancement toward improvements that benefit object detection. If the input image is normally lit, it first passes through a low-light conversion module that transforms it into a low-light image before it is fed into the encoder; if the input is already a low-light image, it is fed into the encoder directly. During testing, the model can be evaluated in the same way as a standard object detection algorithm. Unlike existing object detection algorithms, LDWLE can train a low-light-robust object detection model on standard, normally lit object detection datasets. Additionally, LDWLE is a versatile training framework that can be implemented on most one-stage object detection algorithms. These algorithms typically consist of three components: the backbone, neck, and head. In this framework, the backbone functions as the encoder, while the neck and head form the object detection decoder. Extensive experiments on the COCO, VOC, and ExDark datasets demonstrate the effectiveness of LDWLE for low-light object detection. Quantitatively, it achieves AP of 25.5 and 38.4 on the synthetic datasets COCO-d and VOC-d, respectively, and the best AP of 30.5 on the real-world dataset ExDark. Qualitatively, LDWLE accurately detects most objects on both public real-world low-light datasets and self-collected ones, demonstrating strong adaptability to varying lighting conditions and multi-scale objects.
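
The shared-encoder, dual-decoder training scheme described above can be summarized in code. Below is a minimal PyTorch-style sketch under the abstract's stated design; every concrete module here (the low-light converter, the toy encoder and decoders) and the loss stubs are hypothetical stand-ins for illustration, not the paper's actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LDWLE(nn.Module):
    """Shared encoder feeding a detection decoder and an enhancement decoder."""
    def __init__(self, converter, encoder, det_decoder, enh_decoder):
        super().__init__()
        self.converter = converter      # darkens normally lit inputs (hypothetical stand-in)
        self.encoder = encoder          # detector backbone, shared by both decoders
        self.det_decoder = det_decoder  # detector neck + head
        self.enh_decoder = enh_decoder  # reconstructs a normal-light image

    def forward(self, image, is_low_light=True):
        if not is_low_light:
            image = self.converter(image)  # normally lit input: synthesize low-light first
        feats = self.encoder(image)
        return self.det_decoder(feats), self.enh_decoder(feats)

# Toy components so the sketch runs end to end (shapes only, no real detector).
converter   = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1), nn.Sigmoid())
encoder     = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
det_decoder = nn.Conv2d(16, 5, 1)             # e.g. 4 box coords + 1 score per location
enh_decoder = nn.Conv2d(16, 3, 3, padding=1)  # back to a 3-channel image

model = LDWLE(converter, encoder, det_decoder, enh_decoder)
normal = torch.rand(2, 3, 64, 64)             # a normally lit training batch
det_out, enh_out = model(normal, is_low_light=False)

# Joint objective: a detection loss (stubbed with a dummy target) plus an
# enhancement loss pulling the decoded image back toward the normal-light
# input, so the enhancement branch acts as a regularization signal.
loss = F.mse_loss(det_out, torch.zeros_like(det_out)) + F.l1_loss(enh_out, normal)
loss.backward()  # gradients from both decoders flow into the shared encoder

At test time only the detection path (encoder followed by the detection decoder) is needed, which matches the abstract's statement that the model is evaluated like a standard object detector.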
Pages: 18