The Potential of Diffusion-Based Near-Infrared Image Colorization

Cited by: 1
Authors
Borstelmann, Ayk [1]
Haucke, Timm [1,2]
Steinhage, Volker [1]
Affiliations
[1] Univ Bonn, Inst Comp Sci 4, Friedrich Hirzebruch Allee 8, D-53115 Bonn, Germany
[2] MIT, Comp Sci & Artificial Intelligence Lab, 32 Vassar St, Cambridge, MA 02139 USA
Keywords
near-infrared; diffusion models; camera trapping; unpaired dataset; neural networks; machine learning
DOI
10.3390/s24051565
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Camera traps, an invaluable tool for biodiversity monitoring, capture wildlife activities day and night. In low-light conditions, near-infrared (NIR) imaging is commonly employed to capture images without disturbing animals. However, the reflection properties of NIR light differ from those of visible light in terms of chrominance and luminance, creating a notable gap in human perception. Thus, the objective is to enrich near-infrared images with colors, thereby bridging this domain gap. Conventional colorization techniques are ineffective due to the difference between NIR and visible light. Moreover, regular supervised learning methods cannot be applied because paired training data are rare. Solutions to such unpaired image-to-image translation problems commonly involve generative adversarial networks (GANs), but diffusion models have recently gained attention for their superior performance in various tasks. In response, we present a novel framework utilizing diffusion models for the colorization of NIR images. This framework allows efficient implementation of various methods for colorizing NIR images. We show that NIR colorization is primarily controlled by the translation of the near-infrared intensities to those of visible light. The experimental evaluation of three implementations with increasing complexity shows that even a simple implementation inspired by visible-near-infrared (VIS-NIR) fusion rivals GANs. Moreover, we show that the third implementation is capable of outperforming GANs. With our study, we introduce an intersection field joining the research areas of diffusion models, NIR colorization, and VIS-NIR fusion.
Pages: 21