CoT: Contourlet Transformer for Hierarchical Semantic Segmentation

Cited by: 0
Authors
Shao, Yilin [1]
Sun, Long [1]
Jiao, Licheng [1]
Liu, Xu [1]
Liu, Fang [1]
Li, Lingling [1]
Yang, Shuyuan [1]
Affiliations
[1] Xidian Univ, Sch Artificial Intelligence, Int Res Ctr Intelligent Percept & Computat, Minist Educ China, Key Lab Intelligent Percept & I, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Transformers; Semantics; Semantic segmentation; Task analysis; Computed tomography; Convolutional neural networks; Contourlet transform (CT); semantic segmentation; sparse convolution; Transformer-convolutional neural network (CNN) hybrid model;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Transformer-convolutional neural network (CNN) hybrid learning approach is gaining traction for balancing deep and shallow image features in hierarchical semantic segmentation. However, such hybrid models still face a contradiction between comprehensive semantic understanding and meticulous detail extraction. To solve this problem, this article proposes a novel Transformer-CNN hybrid hierarchical network, dubbed contourlet transformer (CoT). In the CoT framework, the semantic representation process of the Transformer is unavoidably peppered with sparsely distributed points that, while not desired, demand finer detail. Therefore, we design a deep detail representation (DDR) structure to investigate their fine-grained features. First, through the contourlet transform (CT), we distill the high-frequency directional components from the raw image, yielding localized features that accommodate the inductive bias of CNNs. Second, a CNN deep sparse learning (DSL) module takes them as input to represent the underlying detailed features. This memory- and energy-efficient learning method keeps the same sparse pattern between input and output. Finally, the decoder hierarchically fuses the detailed features with the semantic features in an image reconstruction-like fashion. Experiments demonstrate that CoT achieves competitive performance on three benchmark datasets: PASCAL Context [57.21% mean intersection over union (mIoU)], ADE20K (54.16% mIoU), and Cityscapes (84.23% mIoU). Furthermore, we conducted robustness studies to validate its resistance against various types of corruption. Our code is available at: https://github.com/yilinshao/CoT-Contourlet-Transformer.
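The abstract states that the sparse learning step keeps the same sparse pattern between input and output. The sketch below illustrates that property only, and it is not the authors' DSL module. It assumes a submanifold-style sparse convolution, in which outputs are computed only at sites that are already active in the input. The function name `sparse_conv2d_same_pattern`, the dictionary-based feature layout, and the sample values are all hypothetical.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (row, column) of an active pixel


def sparse_conv2d_same_pattern(
    active: Dict[Coord, float],
    kernel: Dict[Coord, float],
) -> Dict[Coord, float]:
    """Apply a CNN-style (correlation) kernel at active sites only.

    Inactive neighbours contribute zero, and no new sites are created,
    so the output has exactly the same set of coordinates as the input.
    """
    out: Dict[Coord, float] = {}
    for (r, c) in active:
        total = 0.0
        for (dr, dc), weight in kernel.items():
            total += weight * active.get((r + dr, c + dc), 0.0)
        out[(r, c)] = total
    return out


# Hypothetical high-frequency responses at three sparse pixel locations.
features = {(0, 0): 1.0, (0, 1): 2.0, (5, 5): 4.0}

# Illustrative 3x3 kernel with all weights equal to 1 (a neighbourhood sum),
# stored as offset -> weight.
kernel = {(dr, dc): 1.0 for dr in (-1, 0, 1) for dc in (-1, 0, 1)}

result = sparse_conv2d_same_pattern(features, kernel)
print(sorted(result.items()))
```

A dense convolution would spread each response over its whole 3x3 neighbourhood and enlarge the set of non-zero pixels at every layer. Restricting outputs to the input's active sites is one way to avoid that growth, which is consistent with the memory and energy savings the abstract mentions.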
Pages: 132-146
Page count: 15