BENet: boundary-enhanced network for real-time semantic segmentation

Cited by: 1
Authors
Lei, Xiaochun [1,2]
Chen, Zeyu [1]
Yu, Zhaoxin [1]
Jiang, Zetao [1,2]
Affiliations
[1] Guilin University of Electronic Technology, School of Computer Science and Information Security, Guilin 541010, Guangxi, People's Republic of China
[2] Guilin University of Electronic Technology, Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin 541004, Guangxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Semantic segmentation; Deep neural networks; Real-time inference; Boundary-enhanced
DOI
10.1007/s00371-024-03320-7
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In the realm of real-time semantic segmentation, deep neural networks have demonstrated promising potential. However, current methods face challenges when it comes to accurately segmenting object boundaries and small objects. This limitation is partly attributed to the prevalence of convolutional neural networks, which often involve multiple sequential down-sampling operations, resulting in the loss of fine-grained details. To overcome this drawback, we introduce BENet, a real-time semantic segmentation network with a focus on enhancing object boundaries. The proposed BENet integrates two key components: the boundary extraction module (BEM) and the boundary adaption layer (BAL). The proposed BEM efficiently extracts boundary information, while the BAL guides the network using this information to preserve intricate details during the feature extraction process. Furthermore, to address the challenges associated with poor segmentation of elongated objects, we introduce the strip mixed aggregation pyramid pooling module (SMAPPM). This module employs strip pooling kernels to effectively expand the contextual representation and receptive field of the network, thereby enhancing overall segmentation performance. Our experiments conducted on a single RTX 3090 GPU show that our method achieves an mIoU of 79.4% at a speed of 45.5 FPS on the Cityscapes test set without ImageNet pre-training.
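The strip pooling idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' SMAPPM: it assumes a single-channel feature map stored as nested lists, omits all convolutions and multi-scale branches, and the function name `strip_pool` is hypothetical. It only shows how pooling along full rows and full columns gives every position context from a long, narrow region, which is what helps with elongated objects.

```python
from typing import List

Grid = List[List[float]]


def strip_pool(feature_map: Grid) -> Grid:
    """Simplified strip pooling on one channel (illustrative assumption only).

    Each output value is the mean of its entire row (horizontal strip)
    plus the mean of its entire column (vertical strip).
    """
    height = len(feature_map)
    width = len(feature_map[0])

    # Horizontal strips: average every row across the full width.
    row_means = [sum(row) / width for row in feature_map]

    # Vertical strips: average every column across the full height.
    col_means = [
        sum(feature_map[r][c] for r in range(height)) / height
        for c in range(width)
    ]

    # Broadcast both strips back to the original H x W shape and fuse by addition.
    return [
        [row_means[r] + col_means[c] for c in range(width)]
        for r in range(height)
    ]


feature_map = [
    [0.0, 1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 9.0, 10.0, 11.0],
]
context = strip_pool(feature_map)
print(context[0][0])  # row 0 mean (1.5) + column 0 mean (4.0) = 5.5
```

In a real network the two pooled strips would typically pass through learned convolutions before fusion; plain addition is used here only to keep the sketch short.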
Pages: 229-241
Page count: 13