ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation

Cited by: 0

Authors:

Jeong, Seonggyun [1]

Heo, Yong Seok [1,2]

Affiliations:

[1] Ajou Univ, Dept Artificial Intelligence, Suwon 16499, South Korea

[2] Ajou Univ, Dept Elect & Comp Engn, Suwon 16499, South Korea

Source:

SENSORS | 2024 / Vol. 24 / Issue 21

Funding:

National Research Foundation of Singapore;

Keywords:

semantic segmentation; neural collapse; class imbalance; transformer;

DOI:

10.3390/s24216913

Chinese Library Classification number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

Semantic segmentation often suffers from class imbalance, where the label ratio for each class in the dataset is not uniform. Recent studies have addressed the issue of class imbalance in semantic segmentation by leveraging the neural collapse phenomenon in conjunction with an Equiangular Tight Frame (ETF). While the use of ETF aids in enhancing the discriminability of minor classes, class correlation is another crucial factor that must be taken into account. However, managing the balance between class correlation and discrimination through neural collapse remains challenging, as these properties inherently conflict with one another. Moreover, this control is established during the training stage, resulting in a fixed classifier. There is no guarantee that this classifier will consistently perform well with different input images. To address this problem, we propose an Equiangular Tight Frame Transformer (ETFT), a transformer-based model that jointly processes the features and classifier using ETF structure, and dynamically generates the classifier as a function of the input for imbalanced semantic segmentation. Specifically, the classifier initialized with the ETF structure is jointly processed with the input patch tokens during the attention process. As a result, the transformed patch tokens, aided by the ETF structure, achieve discriminability between classes while preserving contextual correlation. The classifier, initially structured as an ETF, is adjusted to incorporate the correlation information, benefiting from the attention mechanism. Furthermore, the learned classifier is combined with the fixed ETF classifier, leveraging the advantages of both. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods for imbalanced semantic segmentation on both the ADE20K and Cityscapes datasets.
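
For readers unfamiliar with the ETF structure named in the abstract, the following is a minimal NumPy sketch of the standard simplex ETF construction from the neural collapse literature. It is not the authors' implementation: the function name `simplex_etf`, its parameters, and the choice of a random orthonormal basis are assumptions made for illustration. It builds K unit-norm class vectors whose pairwise cosine similarity is -1/(K-1), which is the maximally separated arrangement a fixed ETF classifier relies on.

```python
import numpy as np


def simplex_etf(num_classes, feat_dim, seed=0):
    """Return a (feat_dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration only; assumes feat_dim >= num_classes.
    """
    if feat_dim < num_classes:
        raise ValueError("this sketch assumes feat_dim >= num_classes")
    k = num_classes
    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix yields orthonormal columns U.
    u, _ = np.linalg.qr(rng.standard_normal((feat_dim, k)))
    # Centering matrix I - (1/K) 1 1^T removes the mean direction.
    center = np.eye(k) - np.ones((k, k)) / k
    # The sqrt(K / (K - 1)) factor rescales every column to unit norm.
    return np.sqrt(k / (k - 1)) * (u @ center)


etf = simplex_etf(num_classes=4, feat_dim=16)
gram = etf.T @ etf  # pairwise inner products between class vectors
off_diagonal = gram[~np.eye(4, dtype=bool)]
```

With four classes, every diagonal entry of `gram` is 1 and every off-diagonal entry is -1/3, up to floating-point error. In a fixed-classifier setup these columns would be frozen, whereas the abstract describes additionally adapting the classifier to each input through attention.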


Pages: 21
