HEAL-SWIN: A Vision Transformer On The Sphere

Citations: 2
Authors
Carlsson, Oscar [1]
Gerken, Jan E. [1]
Linander, Hampus [1]
Spiess, Heiner [2]
Ohlsson, Fredrik [3]
Petersson, Christoffer [1, 4]
Persson, Daniel [1]
Affiliations
[1] Univ Gothenburg, Chalmers Univ Technol, Dept Math Sci, SE-41296 Gothenburg, Sweden
[2] Tech Univ Berlin, Neural Informat Proc, Sci Intelligence, DE-10623 Berlin, Germany
[3] Umea Univ, Dept Math & Math Stat, SE-90187 Umea, Sweden
[4] Zenseact, SE-41756 Gothenburg, Sweden
Source
2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024 | 2024
Funding
Swedish Research Council;
Keywords
DATASET;
DOI
10.1109/CVPR52733.2024.00580
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
High-resolution wide-angle fisheye images are becoming increasingly important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, enabling the network to process spherical representations with minimal computational overhead. We demonstrate the superior performance of our model on both synthetic and real automotive datasets, as well as a selection of other image datasets, for semantic segmentation, depth regression and classification tasks. Our code is publicly available.
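The abstract's point about the nested structure can be illustrated with a minimal sketch. It relies only on a general property of HEALPix: a grid with resolution parameter nside has 12 * nside^2 pixels, and in NESTED ordering the four children of a pixel occupy consecutive indices. Under that assumption, grouping pixels into patches or windows reduces to a reshape. The function names below (`healpix_npix`, `nested_patches`, `nested_parent`) are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np


def healpix_npix(nside: int) -> int:
    """Number of pixels of a full HEALPix grid with resolution parameter nside."""
    return 12 * nside * nside


def nested_patches(features: np.ndarray, patch_size: int) -> np.ndarray:
    """Group NESTED-ordered pixel features into patches of patch_size pixels.

    features: array of shape (npix, channels), assumed to be in NESTED ordering.
    patch_size: a power of 4 (1, 4, 16, ...), so that each patch is exactly
    one coarser HEALPix pixel.
    Returns an array of shape (npix // patch_size, patch_size, channels).
    """
    is_power_of_two = patch_size > 0 and (patch_size & (patch_size - 1)) == 0
    if not is_power_of_two or (patch_size.bit_length() - 1) % 2 != 0:
        raise ValueError("patch_size must be a power of 4")
    npix, channels = features.shape
    if npix % patch_size != 0:
        raise ValueError("npix must be divisible by patch_size")
    # Consecutive NESTED indices share a parent, so a plain reshape suffices.
    return features.reshape(npix // patch_size, patch_size, channels)


def nested_parent(pixel_index: int, levels: int = 1) -> int:
    """Index of the ancestor pixel `levels` resolutions coarser (NESTED ordering)."""
    return pixel_index >> (2 * levels)


# Toy data: nside = 4 gives 192 pixels, here with 3 channels each.
nside = 4
features = np.arange(healpix_npix(nside) * 3, dtype=np.float32).reshape(-1, 3)
patches = nested_patches(features, patch_size=4)
```

Because no gather or index lookup is needed, the grouping costs no more than a change of array shape, which is consistent with the low overhead the abstract describes. The shifted windows of the SWIN architecture require additional handling that this sketch does not attempt.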
Pages: 6067-6077
Number of pages: 11