A transformer-CNN parallel network for image guided depth completion

Cited by: 5

Authors:

Li, Tao ^{[1]}

Dong, Xiucheng ^{[1]}

Lin, Jie ^{[2]}

Peng, Yonghong ^{[3]}

Affiliations:

[1] Xihua Univ, Sch Elect Engn & Elect Informat, Chengdu 610039, Peoples R China

[2] Xihua Univ, Sch Aeronaut & Astronaut, Chengdu 610039, Peoples R China

[3] Manchester Metropolitan Univ, Dept Comp & Math, Manchester M1 5GD, England

Source:

PATTERN RECOGNITION | 2024 / Vol. 150

Funding:

National Natural Science Foundation of China;

Keywords:

Depth completion; Convolutional neural network; Transformer; Token correlation; Conditional random field;

DOI:

10.1016/j.patcog.2024.110305

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Image guided depth completion aims to predict a dense depth map from sparse depth measurements and the corresponding single color image. However, most state-of-the-art methods rely on only a convolutional neural network (CNN) or only a transformer. In this paper, we propose a transformer-CNN parallel network (TCPNet) to integrate the advantages of CNN in local detail recovery and transformer in long-range semantic modeling. Specifically, our CNN branch adopts dense connection to strengthen feature propagation. Because the common transformer computes self-attention over all the tokens in the window, whether or not they are relevant, it inevitably introduces interference and noise. To improve the self-attention accuracy, we propose a correlation-based transformer that allows only nearest neighbor tokens to participate in the self-attention computation. We also design a multi-scale conditional random field (CRF) module to implement multi-scale high-dimensional filtering for depth refinement. The comprehensive experimental results on KITTI and NYUv2 demonstrate that our method outperforms the state-of-the-art methods.
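
The idea of letting only the most relevant tokens take part in self-attention can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not specify how correlation is measured or how neighbors are selected, so the code below assumes scaled dot-product similarity and a fixed `top_k` per query. The function name `topk_self_attention` and its parameters are hypothetical.

```python
import numpy as np

def topk_self_attention(q, k, v, top_k):
    """Self-attention where each query attends only to its top_k most similar keys.

    q, k, v: float arrays of shape (n_tokens, dim).
    Similarity is the scaled dot product (an assumption; the paper's
    correlation measure may differ). Requires 1 <= top_k <= n_tokens.
    Returns (output, weights) with shapes (n_tokens, dim) and (n_tokens, n_tokens).
    """
    n_tokens, dim = q.shape
    scores = q @ k.T / np.sqrt(dim)  # pairwise similarity, shape (n, n)

    # Column indices of the top_k largest scores in every row.
    keep = np.argpartition(scores, -top_k, axis=1)[:, -top_k:]

    # Additive mask: 0 for kept tokens, -inf for all others.
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, keep, 0.0, axis=1)

    # Numerically stable softmax over the kept tokens only.
    masked = scores + mask
    masked -= masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    weights /= weights.sum(axis=1, keepdims=True)

    return weights @ v, weights

# Usage: 6 tokens of dimension 4, each attending to its 2 nearest tokens.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
v = rng.normal(size=(6, 4))
out, w = topk_self_attention(q, k, v, top_k=2)
```

Setting `top_k` equal to the number of tokens recovers ordinary softmax attention over the whole window, which makes the restriction easy to compare against the unrestricted baseline.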

Pages: 11

50 items in total

[1] Parallel Transformer-CNN Model for Medical Image Segmentation
Zhou, Mingkun
Nie, Xueyun
Liu, Yuhang
Li, Doudou
2024 5TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND APPLICATION, ICCEA 2024, 2024, : 1048 - 1051
[2] TCPCNet: a transformer-CNN parallel cooperative network for low-light image enhancement
Wanjun Zhang
Yujie Ding
Miaohui Zhang
Yonghua Zhang
Lvchen Cao
Ziqing Huang
Jun Wang
Multimedia Tools and Applications, 2024, 83 : 52957 - 52972
[3] An Efficient Transformer-CNN Network for Document Image Binarization
Zhang, Lina
Wang, Kaiyuan
Wan, Yi
ELECTRONICS, 2024, 13 (12)
[4] TCPCNet: a transformer-CNN parallel cooperative network for low-light image enhancement
Zhang, Wanjun
Ding, Yujie
Zhang, Miaohui
Zhang, Yonghua
Cao, Lvchen
Huang, Ziqing
Wang, Jun
MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (17) : 52957 - 52972
[5] CSegNet: a hybrid transformer-CNN network for road crack image segmentation
Dong, Hao
Du, Yinlai
Feng, Dong
Hu, Qingyuan
Zhou, Mingzhu
Xing, Jun
Zhang, Long
Wang, Shu
Liu, Yong
INSIGHT, 2024, 66 (12) : 737 - 746
[6] A transformer-CNN for deep image inpainting forensics
Zhu, Xinshan
Lu, Junyan
Ren, Honghao
Wang, Hongquan
Sun, Biao
VISUAL COMPUTER, 2023, 39 (10) : 4721 - 4735
[7] Transformer-CNN hybrid network for crowd counting
Yu J.
Yu Y.
Qian J.
Han X.
Zhu F.
Zhu Z.
Journal of Intelligent and Fuzzy Systems, 2024, 46 (04) : 10773 - 10785
[8] Transformer-CNN for small image object detection
Chen, Yan-Lin
Lin, Chun-Liang
Lin, Yu-Chen
Chen, Tzu-Chun
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2024, 129
[9] Hybrid Transformer-CNN for Real Image Denoising
Zhao, Mo
Cao, Gang
Huang, Xianglin
Yang, Lifang
IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 1252 - 1256
[10] Dual branch Transformer-CNN parametric filtering network for underwater image enhancement
Chang, Baocai
Li, Jinjiang
Ren, Lu
Chen, Zheng
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 100