Lightweight Real-Time Semantic Segmentation Network With Efficient Transformer and CNN

Cited by: 47
Authors
Xu, Guoan [1, 2]
Li, Juncheng [3, 4]
Gao, Guangwei [1, 2]
Lu, Huimin [5]
Yang, Jian [6]
Yue, Dong [1, 2]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Inst Adv Technol, Nanjing, 210023, Peoples R China
[2] Soochow Univ, Prov Key Lab Comp Informat Proc Technol, Suzhou 215006, Peoples R China
[3] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China
[4] Nanjing Univ Sci & Technol, Jiangsu Key Lab Image & Video Understanding Social, Nanjing 210049, Peoples R China
[5] Kyushu Inst Technol, Kitakyushu 8048550, Japan
[6] Nanjing Univ Sci & Technol, Sch Comp Sci & Technol, Nanjing 210049, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Transformers; Convolution; Semantic segmentation; Task analysis; Convolutional neural networks; Computational modeling; Semantics; Real-time semantic segmentation; convolutional neural network; lightweight network; transformer;
DOI
10.1109/TITS.2023.3248089
Chinese Library Classification (CLC) number
TU [Building Science];
Discipline classification code
0813;
Abstract
Over the past decade, convolutional neural networks (CNNs) have become prominent in semantic segmentation. Although CNN models perform impressively, their ability to capture global representations is still insufficient, which leads to suboptimal results. Transformers have recently achieved great success in NLP tasks, demonstrating their advantage in modeling long-range dependencies. They have also attracted considerable attention from computer vision researchers, who reformulate image processing tasks as sequence-to-sequence prediction, but this comes at the cost of degraded local feature details. In this work, we propose a lightweight real-time semantic segmentation network called LETNet. LETNet effectively combines a U-shaped CNN with a Transformer in a capsule embedding style so that each compensates for the other's deficiencies. In addition, the carefully designed Lightweight Dilated Bottleneck (LDB) module and Feature Enhancement (FE) module both benefit training from scratch. Extensive experiments on challenging datasets demonstrate that LETNet achieves a superior balance between accuracy and efficiency. Specifically, it contains only 0.95M parameters and 13.6G FLOPs, yet yields 72.8% mIoU at 120 FPS on the Cityscapes test set and 70.5% mIoU at 250 FPS on the CamVid test set using a single RTX 3090 GPU. Source code will be available at https://github.com/IVIPLab/LETNet.
Pages: 15897-15906
Number of pages: 10
Related papers
50 in total
  • [1] Lightweight Asymmetric Dilation Network for Real-Time Semantic Segmentation
    Hu, Xuegang
    Gong, Yu
    IEEE ACCESS, 2021, 9 : 55630 - 55643
  • [2] EMSFomer: Efficient Multi-Scale Transformer for Real-Time Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    IEEE ACCESS, 2025, 13 : 18239 - 18252
  • [3] LDPNet: A Lightweight Densely Connected Pyramid Network for Real-Time Semantic Segmentation
    Hu, Xuegang
    Jing, Liyuan
    IEEE ACCESS, 2020, 8 : 212647 - 212658
  • [4] Lightweight and efficient feature fusion real-time semantic segmentation network
    Zhong, Jie
    Chen, Aiguo
    Jiang, Yizhang
    Sun, Chengcheng
    Peng, Yuheng
    IMAGE AND VISION COMPUTING, 2025, 154
  • [5] Lightweight and efficient asymmetric network design for real-time semantic segmentation
    Xiu-Ling Zhang
    Bing-Ce Du
    Zhao-Ci Luo
    Kai Ma
    Applied Intelligence, 2022, 52 : 564 - 579
  • [6] Lightweight and efficient asymmetric network design for real-time semantic segmentation
    Zhang, Xiu-Ling
    Du, Bing-Ce
    Luo, Zhao-Ci
    Ma, Kai
    APPLIED INTELLIGENCE, 2022, 52 (01) : 564 - 579
  • [7] LACTNet: A Lightweight Real-Time Semantic Segmentation Network Based on an Aggregated Convolutional Neural Network and Transformer
    Zhang, Xiangyue
    Li, Hexiao
    Ru, Jingyu
    Ji, Peng
    Wu, Chengdong
    ELECTRONICS, 2024, 13 (12)
  • [8] MSCFNet: A Lightweight Network With Multi-Scale Context Fusion for Real-Time Semantic Segmentation
    Gao, Guangwei
    Xu, Guoan
    Yu, Yi
    Xie, Jin
    Yang, Jian
    Yue, Dong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (12) : 25489 - 25499
  • [9] ELUNet: an efficient and lightweight U-shape network for real-time semantic segmentation
    Ai, Yufeng
    Guo, Jichang
    Wang, Yudong
    JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (02)
  • [10] A Lightweight and Dynamic Convolutional Network for Real-time Semantic Segmentation
    Zhang, Chunyu
    Xu, Fang
    Wu, Chengdong
    2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 4062 - 4067