EGSNet: An Efficient Glass Segmentation Network Based on Multi-Level Heterogeneous Architecture and Boundary Awareness

Cited by: 0
Authors
Chen, Guojun [1 ]
Cui, Tao [1 ]
Hou, Yongjie [1 ]
Li, Huihui [1 ]
Affiliations
[1] China Univ Petr East China, Qingdao Inst Software, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2024 / Vol. 81 / No. 3
Keywords
Image segmentation; multi-level heterogeneous architecture; feature differences;
DOI
10.32604/cmc.2024.056093
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Existing glass segmentation networks have high computational complexity and large memory footprints, leading to high hardware requirements and long inference times, which makes them unsuitable for efficiency-critical real-time tasks such as autonomous driving. This inefficiency stems mainly from using homogeneous modules to process features from different layers: such modules rely on computationally intensive convolutions and parameter-heavy weighting branches to accommodate the differences in information across layers. We propose an efficient glass segmentation network (EGSNet) based on a multi-level heterogeneous architecture and boundary awareness to balance model performance and efficiency. EGSNet divides the feature layers from different stages into low-level understanding, semantic-level understanding, and global understanding with boundary guidance. Based on the information differences among these layers, we further propose a multi-angle collaborative enhancement (MCE) module, which extracts detailed information from shallow features, and a large-scale contextual feature extraction (LCFE) module, which captures semantic logic from deep features. The model is trained and evaluated on the glass segmentation datasets HSO (Home-Scene-Oriented) and Trans10k-stuff, and EGSNet achieves the best efficiency and performance compared with advanced methods. On the HSO test set, EGSNet reaches an IoU of 0.804, Fβ of 0.847, MAE (Mean Absolute Error) of 0.084, and BER (Balance Error Rate) of 0.085, with only 27.15 GFLOPs (Giga Floating-Point Operations). Experimental results show that EGSNet significantly improves the efficiency of glass segmentation while delivering better performance.
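The abstract outlines a heterogeneous data flow: shallow features are refined for detail (MCE), deep features supply large-scale context (LCFE), and a boundary signal guides the fusion. Below is a minimal, hypothetical PyTorch-style sketch of that layout, assuming standard backbone feature maps; the DetailBranch, ContextBranch, and fusion choices are placeholders and do not reproduce the authors' MCE/LCFE implementations.

# Hypothetical sketch (not the authors' code): a heterogeneous two-branch layout
# matching the abstract -- shallow features pass through a lightweight detail branch
# (stand-in for MCE), deep features through a large-context branch (stand-in for
# LCFE), and a boundary prediction guides the fusion. Only the overall data flow is
# implied by the abstract; every module body below is a placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DetailBranch(nn.Module):
    """Placeholder for the MCE module: cheap 3x3 conv on high-resolution features."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                                  nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.conv(x)

class ContextBranch(nn.Module):
    """Placeholder for the LCFE module: dilated conv for large-scale context."""
    def __init__(self, c_in, c_out, dilation=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))
    def forward(self, x):
        return self.conv(x)

class HeterogeneousGlassSeg(nn.Module):
    def __init__(self, c_low=64, c_high=256, c_mid=64):
        super().__init__()
        self.detail = DetailBranch(c_low, c_mid)      # low-level understanding
        self.context = ContextBranch(c_high, c_mid)   # semantic/global understanding
        self.boundary_head = nn.Conv2d(c_mid, 1, 1)   # boundary guidance
        self.seg_head = nn.Conv2d(2 * c_mid, 1, 1)    # final glass mask logits

    def forward(self, feat_low, feat_high):
        d = self.detail(feat_low)                     # keep spatial detail
        c = self.context(feat_high)                   # capture semantic context
        c = F.interpolate(c, size=d.shape[-2:], mode="bilinear", align_corners=False)
        boundary = torch.sigmoid(self.boundary_head(d))
        fused = torch.cat([d * (1 + boundary), c], dim=1)  # boundary-aware fusion
        return self.seg_head(fused), boundary

# Usage with dummy backbone features (shapes chosen arbitrarily for illustration):
low = torch.randn(1, 64, 128, 128)     # shallow, high-resolution feature map
high = torch.randn(1, 256, 32, 32)     # deep, low-resolution feature map
mask_logits, boundary_map = HeterogeneousGlassSeg()(low, high)

The split mirrors the efficiency argument in the abstract: only the low-resolution deep branch pays for large-context operations, while the high-resolution shallow branch stays lightweight.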
Pages: 3969-3987
Number of pages: 19