Laformer: Vision Transformer for Panoramic Image Semantic Segmentation

Cited by: 4
Authors
Yuan, Zheng [1]
Wang, Junhua [3]
Lv, Yuxin [2]
Wang, Ding [2]
Fang, Yi [2]
Affiliations
[1] Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China
[2] Fudan Univ, Sch Informat Sci & Technol, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Optoelect, Shanghai Frontiers Sci Res Base Intelligent Optoel, Shanghai 200438, Peoples R China
Keywords
Deformable convolution; panoramic images; prototype adaptation; self-training; semantic segmentation
DOI
10.1109/LSP.2023.3337716
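Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract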
Recent years have seen great advances in semantic segmentation. However, general methods target pinhole images and tend to underperform when applied directly to panoramic images. With panoramic cameras now widely deployed, it is important to develop feasible approaches for training segmentation models suited to their real-time applications. To address this problem, we propose a novel self-training method and achieve competitive results on the DensePASS dataset. Specifically, we propose a deformable merge module tailored to panoramic images that efficiently and accurately incorporates features from different levels. We also design a novel prototype adaptation term that helps the model better learn the class-wise feature embeddings of distorted objects. Finally, we use a simple and effective evaluation method to achieve real-time, improved inference performance. Combined, these components reach an mIoU of 58.27% on the DensePASS dataset, a new state-of-the-art result.
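The abstract reports its headline result as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is conventionally computed from flattened label maps. It is not the authors' evaluation code: the function name `mean_iou`, the ignore label 255, and the toy labels are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred and target are flat sequences of integer class labels.
    ignore_index (assumed here to be 255) marks unlabeled pixels to skip.
    """
    # confusion[t][p] counts pixels with ground truth t predicted as p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        fn = sum(confusion[c]) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both pred and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is unlabeled and ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

Here the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. Benchmark implementations usually accumulate the confusion matrix over the whole validation set before dividing, rather than averaging per-image scores.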
Pages: 1792-1796
Page count: 5