DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation

Cited by: 2
Authors
Atif, Nadeem [1]
Mazhar, Saquib [1]
Ahamed, Shaik Rafi [1]
Bhuyan, M. K. [1]
Alfarhood, Sultan [2]
Safran, Mejdl [2]
Affiliations
[1] Indian Inst Technol Guwahati, Gauhati 781039, Assam, India
[2] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11543, Saudi Arabia
Keywords
Semantic segmentation; deep learning; real-time processing; autonomous driving; resource-constrained
DOI
10.1109/ACCESS.2024.3359425
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In semantic segmentation, an input image is partitioned into multiple meaningful segments, each corresponding to a specific object or region. Multi-scale context plays a vital role in accurately recognizing objects of different sizes and is therefore key to improving overall accuracy. To capture it, we introduce a novel strategy called Distributed Multi-scale Pyramid Pooling (DMPP), which extracts multi-scale context at multiple levels of the feature hierarchy. More specifically, we employ Pyramid Pooling Modules (PPM) in a distributed fashion after all three stages of the encoding phase. This enhances the feature representation capability of the network and leads to better performance. To extract context at a more granular level, we propose an Efficient Multi-scale Context Aggregation (EMCA) module, which uses a combination of small and large kernels with large and small dilation rates, respectively. This alleviates the problem of sparse sampling and leads to consistent recognition of different regions. Apart from model accuracy, small model size and efficient execution are critically important for real-time mobile applications. To this end, we employ a resource-friendly combination of depthwise and factorized convolutions in the EMCA module, which drastically reduces the number of parameters without significantly compromising accuracy. Based on the EMCA module and DMPP, we propose a lightweight, real-time Distributed Multi-scale Pyramid Network (DMPNet) that achieves an excellent accuracy-efficiency trade-off. We conduct extensive experiments on driving datasets (Cityscapes and CamVid) and on a general-purpose dataset (ADE20K) to show the effectiveness of the proposed method.
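The abstract does not give the exact layer layout of the EMCA module, so the following is only an illustrative sketch of the arithmetic behind two of its claims: that a small kernel with a large dilation rate and a large kernel with a small dilation rate can span a similar region, and that depthwise plus factorized convolutions need far fewer weights than a dense convolution. The function names, the channel count of 64, and the specific kernel/dilation pairs are assumptions chosen for illustration and are not taken from the paper.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Span in pixels covered by a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a dense k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_factorized_params(channels: int, k: int) -> int:
    """Weight count of a depthwise k x 1 conv, a depthwise 1 x k conv,
    and a 1 x 1 pointwise conv that mixes channels (bias ignored).
    This layout is an assumed stand-in, not the paper's exact block."""
    depthwise = 2 * channels * k      # one k-tap filter per channel, twice
    pointwise = channels * channels   # 1 x 1 convolution across channels
    return depthwise + pointwise


# Hypothetical pairs: small kernel with large dilation, large kernel with small dilation.
for k, d in [(3, 4), (5, 2)]:
    print(f"k={k}, dilation={d}: spans {effective_kernel_size(k, d)} pixels")

channels = 64  # assumed channel width
dense = standard_conv_params(channels, channels, 3)
light = depthwise_factorized_params(channels, 3)
print(f"dense 3x3: {dense} weights, depthwise+factorized: {light} weights")
```

Both hypothetical pairs span the same window, but the larger kernel samples it more densely, which is the intuition behind combining the two to reduce sparse sampling.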
Pages: 16573-16585
Number of pages: 13