Swin transformer and ResNet based deep networks for low-light image enhancement

Cited by: 8
Authors
Xu, Lintao [1 ,2 ]
Hu, Changhui [1 ,2 ]
Zhang, Bo [1 ,2 ]
Wu, Fei [1 ,2 ]
Cai, Ziyun [1 ,2 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Automat, Wenyuan Rd, Nanjing 210023, Jiangsu, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Wenyuan Rd, Nanjing 210023, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Low-light image enhancement; Generative adversarial network; Swin Transformer; Random paired learning; Quality assessment; Retinex
DOI
10.1007/s11042-023-16650-w
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Low-light image enhancement is a long-standing low-level vision problem that aims to improve the visual quality of images captured in low-illumination environments. Convolutional neural networks (CNNs) currently form the foundation of most low-light image enhancement algorithms, but their limited receptive field prevents them from establishing long-range contextual interactions. In recent years, the Transformer has received increasing attention in computer vision owing to its global attention mechanism. In this paper, we design a Swin Transformer and ResNet-based Generative Adversarial Network (STRN) for low-light image enhancement by combining the advantages of ResNet and the Swin Transformer. The STRN consists of a U-shaped generator and multiscale discriminators. The generator is composed of a shallow feature extraction module, a deep feature extraction module, and an image reconstruction module. To capture both global and local attention, Swin Transformer blocks and ResNet blocks are used alternately in the deep feature extraction module. A self-perceptual loss and a spatial consistency loss are employed to constrain the random paired training of STRN. Experimental results on benchmark datasets and real-world low-light images demonstrate that the proposed STRN achieves state-of-the-art performance on low-light image enhancement in terms of both visual quality and evaluation metrics.
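
The generator structure outlined in the abstract can be illustrated with a minimal PyTorch sketch. This is only an interpretation of the abstract, not the authors' implementation: the module names, channel width, window size, and depth are assumptions, and the shifted-window scheme, the U-shaped skip connections, the multiscale discriminators, and the losses are omitted. It shows only how Swin-style window-attention blocks and residual convolution blocks can be alternated in a deep feature extraction stage between shallow feature extraction and image reconstruction.

# Minimal sketch (assumed, simplified) of an alternating Swin/ResNet deep feature stage.
import torch
import torch.nn as nn

class WindowAttentionBlock(nn.Module):
    """Swin-style block: multi-head self-attention inside non-overlapping windows, then an MLP."""
    def __init__(self, dim, window=8, heads=4):
        super().__init__()
        self.window = window
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):  # x: (B, C, H, W); H and W must be divisible by the window size
        B, C, H, W = x.shape
        w = self.window
        # Partition the feature map into w x w windows -> (B * num_windows, w * w, C).
        t = x.reshape(B, C, H // w, w, W // w, w).permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        h = self.norm1(t)
        t = t + self.attn(h, h, h, need_weights=False)[0]  # self-attention within each window
        t = t + self.mlp(self.norm2(t))
        # Reverse the window partition back to (B, C, H, W).
        t = t.reshape(B, H // w, W // w, w, w, C).permute(0, 5, 1, 3, 2, 4)
        return t.reshape(B, C, H, W)

class ResBlock(nn.Module):
    """Plain residual convolution block for local features."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class DeepFeatureStage(nn.Module):
    """Alternates window-attention and residual blocks, as the abstract describes."""
    def __init__(self, dim=64, depth=3):
        super().__init__()
        blocks = []
        for _ in range(depth):
            blocks += [WindowAttentionBlock(dim), ResBlock(dim)]
        self.blocks = nn.Sequential(*blocks)

    def forward(self, x):
        return self.blocks(x)

if __name__ == "__main__":
    shallow = nn.Conv2d(3, 64, 3, padding=1)      # shallow feature extraction
    deep = DeepFeatureStage(dim=64, depth=3)      # deep feature extraction
    reconstruct = nn.Conv2d(64, 3, 3, padding=1)  # image reconstruction
    low_light = torch.rand(1, 3, 128, 128)
    enhanced = reconstruct(deep(shallow(low_light)))
    print(enhanced.shape)  # torch.Size([1, 3, 128, 128])

In a full GAN setup, the output of such a generator would be scored by multiscale discriminators and constrained by the losses named in the abstract; those components are deliberately left out of this sketch.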
Pages: 26621-26642
Page count: 22