CSC-Unet: A Novel Convolutional Sparse Coding Strategy Based Neural Network for Semantic Segmentation

Cited by: 11
Authors
Tang, Haitong [1]
He, Shuang [1]
Yang, Mengduo [2]
Lu, Xia [3]
Yu, Qin [4]
Liu, Kaiyue [1]
Yan, Hongjie [5]
Wang, Nizhuan [1, 6, 7]
Affiliations
[1] Jiangsu Ocean Univ, Sch Geomat & Marine Informat, Lianyungang 222005, Peoples R China
[2] Suzhou Inst Trade & Commerce, Sch Informat Technol, Suzhou 215009, Peoples R China
[3] Suzhou Univ Sci & Technol, Sch Geog Sci & Geomat Engn, Suzhou 215000, Peoples R China
[4] Jiangsu Ocean Univ, Sch Comp Engn, Lianyungang 222005, Peoples R China
[5] Xuzhou Med Univ, Affiliated Lianyungang Hosp, Dept Neurol, Lianyungang 222002, Peoples R China
[6] ShanghaiTech Univ, Sch Biomed Engn, Shanghai 201210, Peoples R China
[7] Hong Kong Polytech Univ, Dept Chinese & Bilingual Studies, Hong Kong, Peoples R China
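Funding
National Natural Science Foundation of China
Keywords
U-Net; semantic segmentation; deep learning; convolution operation; convolutional sparse coding (CSC); ICA MODEL; REPRESENTATION
DOI
10.1109/ACCESS.2024.3373619
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract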
Performing semantic segmentation with high accuracy remains challenging because of the complexity of real-world scenes. Many semantic segmentation methods based on traditional deep learning capture the semantic and appearance information of images insufficiently, which limits their generality and robustness across application scenarios. In this paper, we therefore propose a novel strategy that reformulates the widely used convolution operation as a multi-layer convolutional sparse coding (CSC) block within a semantic segmentation method to mitigate this deficiency. To demonstrate the effectiveness of the idea, we take the widely used U-Net model as the baseline and design a series of CSC-Unet models based on it. Through extensive analysis and experiments, we provide credible evidence that the multi-layer convolutional sparse coding block enables a semantic segmentation model to converge faster, extract finer semantic and appearance information from images, and better recover spatial detail. The best CSC-Unet model significantly outperforms the original U-Net on three public datasets covering different scenarios: 87.14% vs. 84.71% on the DeepCrack dataset, 68.91% vs. 67.09% on the Nuclei dataset, and 53.68% vs. 48.82% on the CamVid dataset. In addition, the proposed strategy could potentially be used to significantly improve the segmentation performance of any semantic segmentation model that involves convolution operations.
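The abstract does not spell out how the convolutional sparse coding block is computed. As a rough illustration only, the sketch below solves a single-layer, 1-D convolutional sparse coding problem with iterative soft-thresholding (ISTA), a standard solver for this kind of problem. Using ISTA here, along with the function names (`csc_ista`, `synthesize`, `analyze`), the 1-D setting, and the default values of `lam` and `n_iter`, is an assumption made for illustration and is not taken from the paper.

```python
# Minimal 1-D convolutional sparse coding sketch (illustrative assumption,
# not the paper's implementation). Solves
#   min_z 0.5 * ||x - sum_k d_k * z_k||^2 + lam * sum_k ||z_k||_1
# with ISTA, where * is full 1-D convolution.


def soft_threshold(v, t):
    """Shrink each entry toward zero by t (proximal operator of the L1 norm)."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def synthesize(codes, filters):
    """Apply the dictionary: sum of full convolutions of each code with its filter."""
    n, m = len(codes[0]), len(filters[0])
    out = [0.0] * (n + m - 1)
    for z, d in zip(codes, filters):
        for i, zi in enumerate(z):
            for j, dj in enumerate(d):
                out[i + j] += zi * dj
    return out


def analyze(residual, filters, n):
    """Apply the adjoint: correlate the residual with every filter."""
    return [
        [sum(dj * residual[i + j] for j, dj in enumerate(d)) for i in range(n)]
        for d in filters
    ]


def objective(x, codes, filters, lam):
    """Reconstruction error plus L1 sparsity penalty."""
    r = [a - b for a, b in zip(x, synthesize(codes, filters))]
    return 0.5 * sum(v * v for v in r) + lam * sum(abs(v) for z in codes for v in z)


def csc_ista(x, filters, lam=0.1, n_iter=50):
    """Run n_iter ISTA steps from zero codes; return codes and objective history."""
    n = len(x) - len(filters[0]) + 1
    # Conservative step: 1 / (upper bound on the squared operator norm).
    step = 1.0 / sum(sum(abs(v) for v in d) ** 2 for d in filters)
    codes = [[0.0] * n for _ in filters]
    history = [objective(x, codes, filters, lam)]
    for _ in range(n_iter):
        r = [a - b for a, b in zip(x, synthesize(codes, filters))]
        grad = analyze(r, filters, n)
        codes = [
            soft_threshold([z + step * g for z, g in zip(zk, gk)], step * lam)
            for zk, gk in zip(codes, grad)
        ]
        history.append(objective(x, codes, filters, lam))
    return codes, history
```

Starting from zero codes, the first iteration reduces to correlating the input with the filters and applying a shrinkage nonlinearity, which is one way to see sparse coding iterations as a generalization of a convolution layer followed by an activation.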
Pages: 35844-35854
Number of pages: 11