LDWLE: self-supervised driven low-light object detection framework

Cited: 0
Authors
Shen, Xiaoyang [1 ,2 ]
Li, Haibin [1 ,2 ]
Li, Yaqian [1 ,2 ]
Zhang, Wenming [1 ,2 ]
Affiliations
[1] Yanshan Univ, Coll Elect Engn, Qinhuangdao 066000, Hebei, Peoples R China
[2] Key Lab Ind Comp Control Engn Hebei Prov, Qinhuangdao 066000, Hebei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Object detection; Low-light transformation; Self-supervised learning; Jointly trained; Regularization signal;
DOI
10.1007/s40747-024-01681-z
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Low-light object detection involves identifying and locating objects in images captured under poor lighting conditions. It plays a significant role in surveillance and security, nighttime pedestrian recognition, and autonomous driving, with broad application prospects. Most existing object detection algorithms and datasets are designed for normal lighting conditions, so detection performance drops sharply in low-light environments. To address this issue, we propose the Low-Light Detection with Low-Light Enhancement (LDWLE) framework. LDWLE is an encoder-decoder architecture: the encoder transforms the raw input into a compact, abstract representation (encoding), and the decoder gradually generates the target output from that representation. Specifically, during training, low-light images are fed into the encoder, whose feature representations are decoded by two separate decoders: an object detection decoder and a low-light image enhancement decoder. The two decoders share the same encoder and are trained jointly; throughout training they optimize each other, guiding the low-light image enhancement toward improvements that benefit object detection. If an input image is normally lit, it first passes through a low-light conversion module that transforms it into a low-light image before it is fed into the encoder; if it is already a low-light image, it enters the encoder directly. During testing, the model is evaluated in the same way as a standard object detection algorithm. Compared with existing object detection algorithms, LDWLE can train a low-light-robust object detection model from standard, normally lit object detection datasets. Moreover, LDWLE is a versatile training framework that can be applied to most one-stage object detection algorithms, which typically consist of three components: a backbone, a neck, and a head. In this framework, the backbone serves as the encoder, while the neck and head form the object detection decoder. Extensive experiments on the COCO, VOC, and ExDark datasets demonstrate the effectiveness of LDWLE for low-light object detection. Quantitatively, it achieves APs of 25.5 and 38.4 on the synthetic datasets COCO-d and VOC-d, respectively, and the best AP of 30.5 on the real-world ExDark dataset. Qualitatively, LDWLE accurately detects most objects on both public and self-collected real-world low-light datasets, demonstrating strong adaptability to varying lighting conditions and multi-scale objects.
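The joint training scheme described in the abstract can be summarized in a short sketch. This is a minimal illustration under stated assumptions, not the authors' released code: the module and parameter names (`darken`, `det_decoder`, `enh_decoder`, `alpha`) are hypothetical. The abstract specifies only that the backbone serves as the shared encoder, that the neck and head form the detection decoder, and that normally lit inputs pass through a low-light conversion module first.

```python
import torch.nn as nn

class LDWLESketch(nn.Module):
    """Minimal sketch of the LDWLE training layout (assumed names).

    A shared encoder (the detector's backbone) feeds two decoders:
    the detection decoder (neck + head of a one-stage detector) and a
    low-light image enhancement decoder, trained jointly.
    """

    def __init__(self, backbone, det_decoder, enh_decoder, darken):
        super().__init__()
        self.backbone = backbone        # shared encoder
        self.det_decoder = det_decoder  # neck + head -> detections
        self.enh_decoder = enh_decoder  # features -> enhanced image
        self.darken = darken            # low-light conversion module

    def forward(self, image, is_low_light):
        # Normally lit inputs are synthetically darkened first;
        # genuine low-light inputs go straight to the encoder.
        x = image if is_low_light else self.darken(image)
        feats = self.backbone(x)
        return self.det_decoder(feats), self.enh_decoder(feats)


def training_step(model, image, targets, det_loss_fn, enh_loss_fn, alpha=1.0):
    """One joint update on a normally lit image (hypothetical losses).

    The enhancement loss acts as a regularization signal on the shared
    encoder; `alpha` is an assumed weighting, not taken from the paper.
    """
    det_out, enh_out = model(image, is_low_light=False)
    # The enhancement branch is assumed to reconstruct the original
    # well-lit image from features of its darkened counterpart.
    loss = det_loss_fn(det_out, targets) + alpha * enh_loss_fn(enh_out, image)
    loss.backward()
    return loss
```

Because only the backbone and detection decoder are needed to produce detections, the enhancement decoder can be discarded after training, which is consistent with the abstract's note that the model is evaluated like a standard object detection algorithm.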
Pages: 18