CoT: Contourlet Transformer for Hierarchical Semantic Segmentation

Cited by: 0

Authors:

Shao, Yilin [1]

Sun, Long [1]

Jiao, Licheng [1]

Liu, Xu [1]

Liu, Fang [1]

Li, Lingling [1]

Yang, Shuyuan [1]

Affiliations:

[1] Xidian Univ, Sch Artificial Intelligence, Int Res Ctr Intelligent Percept & Computat, Minist Educ China, Key Lab Intelligent Percept & I, Xian 710071, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2025, Vol. 36, No. 1

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Transformers; Semantics; Semantic segmentation; Task analysis; Computed tomography; Convolutional neural networks; Contourlet transform (CT); semantic segmentation; sparse convolution; Transformer-convolutional neural network (CNN) hybrid model;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The Transformer-convolutional neural network (CNN) hybrid learning approach is gaining traction for balancing deep and shallow image features in hierarchical semantic segmentation. However, such approaches still face a contradiction between comprehensive semantic understanding and meticulous detail extraction. To solve this problem, this article proposes a novel Transformer-CNN hybrid hierarchical network, dubbed contourlet transformer (CoT). In the CoT framework, the semantic representation process of the Transformer is unavoidably peppered with sparsely distributed points that, while not desired, demand finer detail. Therefore, we design a deep detail representation (DDR) structure to investigate their fine-grained features. First, through the contourlet transform (CT), we distill the high-frequency directional components from the raw image, yielding localized features that accommodate the inductive bias of CNNs. Second, a CNN deep sparse learning (DSL) module takes them as input to represent the underlying detailed features. This memory- and energy-efficient learning method keeps the same sparse pattern between input and output. Finally, the decoder hierarchically fuses the detailed features with the semantic features in an image reconstruction-like fashion. Experiments demonstrate that CoT achieves competitive performance on three benchmark datasets: PASCAL Context [57.21% mean intersection over union (mIoU)], ADE20K (54.16% mIoU), and Cityscapes (84.23% mIoU). Furthermore, we conducted robustness studies to validate its resistance to various sorts of corruption. Our code is available at: https://github.com/yilinshao/CoT-Contourlet-Transformer.


Pages: 132-146

Number of pages: 15

