Laformer: Vision Transformer for Panoramic Image Semantic Segmentation

Cited by: 4

Authors:

Yuan, Zheng [1]

Wang, Junhua [3]

Lv, Yuxin [2]

Wang, Ding [2]

Fang, Yi [2]

Affiliations:

[1] Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China

[2] Fudan Univ, Sch Informat Sci & Technol, Shanghai 200433, Peoples R China

[3] Fudan Univ, Inst Optoelect, Shanghai Frontiers Sci Res Base Intelligent Optoel, Shanghai 200438, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2023 / Vol. 30

Keywords:

Deformable convolution; panoramic images; prototype adaptation; self-training; semantic segmentation

DOI:

10.1109/LSP.2023.3337716

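Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification code:

0808; 0809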

Abstract:

Semantic segmentation has advanced considerably in recent years. However, general-purpose methods target pinhole images and tend to underperform when applied directly to panoramic images. With the wide adoption of panoramic cameras, it is important to develop feasible approaches for training segmentation models suited to real-time applications. To address this problem, we propose a novel self-training method and achieve competitive results on the DensePASS dataset. Specifically, we propose a deformable merge module tailored to panoramic images that efficiently and accurately incorporates features from different levels. We design a novel prototype adaptation term that helps the model learn the class-wise feature embeddings of distorted objects. Finally, we use a simple and effective evaluation method to achieve real-time and improved inference performance. Combined, these components reach an mIoU of 58.27% on the DensePASS dataset, a new state-of-the-art result.


Pages: 1792-1796

Page count: 5

