Unsupervised Cross-Spectrum Depth Estimation by Visible-Light and Thermal Cameras

Cited by: 1
Authors
Guo, Yubin [1]
Qi, Xinlei [1]
Xie, Jin [1]
Xu, Cheng-Zhong [2]
Kong, Hui [3]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Jiangsu, Peoples R China
[2] Univ Macau, Dept Comp Sci, State Key Lab Internet Things Smart City SKL IOTSC, Macau, Peoples R China
[3] Univ Macau, Dept Electromech Engn EME, State Key Lab Internet Things Smart City SKL IOTSC, Macau, Peoples R China
Keywords
Unsupervised learning; transfer learning; multispectral imaging; computer vision
DOI
10.1109/TITS.2023.3279559
Chinese Library Classification
TU [Architectural Science]
Discipline classification code
0813
Abstract
Cross-spectrum depth estimation aims to provide a reliable depth map under varying illumination conditions from a pair of dual-spectrum images. It is valuable for autonomous driving when vehicles are equipped with two cameras of different modalities. However, images captured by cameras of different modalities can differ substantially in photometric appearance, which makes cross-spectrum depth estimation very challenging. Moreover, the shortage of large-scale open-source datasets hinders further research in this field. In this paper, we propose an unsupervised, visible-light (VIS) image-guided cross-spectrum (i.e., thermal and visible-light, TIR-VIS for short) depth-estimation framework. The input to the framework is a cross-spectrum stereo pair (one VIS image and one thermal image). First, we train a base depth-estimation network using VIS stereo pairs. To adapt the trained network to cross-spectrum images, we propose a multi-scale feature-transfer network that transfers features from the TIR domain to the VIS domain. Furthermore, we introduce a cross-spectrum depth cycle-consistency mechanism to improve depth estimation for dual-spectrum image pairs. We also publicly release a large cross-spectrum dataset of visible-light and thermal stereo images captured in different scenes. Experimental results show that our method achieves better depth-estimation results than the existing methods compared. Our code and dataset are available at https://github.com/whitecrow1027/CrossSP_Depth.
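The abstract does not spell out how the depth cycle-consistency term is computed, so the sketch below is only a generic illustration of a consistency check between the two views of a rectified stereo pair, not the authors' formulation. It assumes that a pixel at column x in the left view with disparity d matches column x - d in the right view, and it uses nearest-pixel sampling; the function name `left_right_consistency` and the list-of-rows disparity layout are hypothetical choices made for this example.

```python
from typing import List, Optional


def left_right_consistency(
    disp_left: List[List[float]],
    disp_right: List[List[float]],
) -> Optional[float]:
    """Mean |d_L(x) - d_R(x - d_L(x))| over pixels whose match is in bounds.

    Assumption: rectified stereo, so matches lie on the same row and the
    left pixel at column x corresponds to the right pixel at x - d_L(x).
    Returns None when no pixel has an in-bounds match.
    """
    total = 0.0
    count = 0
    for row_l, row_r in zip(disp_left, disp_right):
        width = len(row_r)
        for x, d_l in enumerate(row_l):
            # Nearest-pixel lookup of the matching column in the right view.
            x_r = int(round(x - d_l))
            if 0 <= x_r < width:
                total += abs(d_l - row_r[x_r])
                count += 1
    return total / count if count else None


if __name__ == "__main__":
    # One scanline with a constant disparity of one pixel in both views.
    left = [[1.0, 1.0, 1.0, 1.0]]
    right = [[1.0, 1.0, 1.0, 1.0]]
    print(left_right_consistency(left, right))
```

A value near zero means the two disparity maps agree where they overlap; a training loss built on this idea would typically use differentiable (bilinear) sampling rather than the nearest-pixel lookup used here.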
Pages: 10937-10947
Page count: 11
Related papers
50 in total
  • [31] Wavelet-based depth map estimation for light field cameras
    1600, Institute of Electrical and Electronics Engineers Inc., United States
  • [32] DEPTH ESTIMATION BY ANALYZING INTENSITY DISTRIBUTION FOR LIGHT-FIELD CAMERAS
    Xu, Yatong
    Jin, Xin
    Dai, Qionghai
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 3540 - 3544
  • [33] Enhanced Depth Map Estimation in Low Light Conditions for RGB Cameras
    Chang, Joseph
    Nguyen, Truong Q.
    18TH INTERNATIONAL SOC DESIGN CONFERENCE 2021 (ISOCC 2021), 2021, : 21 - 22
  • [34] Depth Estimation with Occlusion Modeling Using Light-Field Cameras
    Wang, Ting-Chun
    Efros, Alexei A.
    Ramamoorthi, Ravi
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2016, 38 (11) : 2170 - 2181
  • [35] Wavelet-Based Depth Map Estimation for Light Field Cameras
    Mishiba, Kazu
    Oyamada, Yuji
    Kondo, Katsuya
    2016 IEEE 5TH GLOBAL CONFERENCE ON CONSUMER ELECTRONICS, 2016,
  • [36] Depth Map Estimation Using Census Transform for Light Field Cameras
    Tomioka, Takayuki
    Mishiba, Kazu
    Oyamada, Yuji
    Kondo, Katsuya
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (11) : 2711 - 2720
  • [37] ENHANCED DEPTH ESTIMATION FOR HAND-HELD LIGHT FIELD CAMERAS
    Qin, Yanwen
    Jin, Xin
    Chen, Yanqin
    Dai, Qionghai
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 2032 - 2036
  • [38] CROSS-SPECTRUM AND COHERENCE FUNCTION ESTIMATION USING TIME-DELAYED THOMSON MULTITAPERS
    Hansson-Sandsten, Maria
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4240 - 4243
  • [39] Human segmentation by geometrically fusing visible-light and thermal imageries
    Jian Zhao
    Sen-ching S. Cheung
    Multimedia Tools and Applications, 2014, 73 : 61 - 89
  • [40] Human segmentation by geometrically fusing visible-light and thermal imageries
    Zhao, Jian
    Cheung, Sen-ching S.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 73 (01) : 61 - 89