A Survey on Image Semantic Segmentation Using Deep Learning Techniques

Cited by: 15

Authors:

Cheng, Jieren [1,3]

Li, Hua [2]

Li, Dengbo [3]

Hua, Shuai [2]

Sheng, Victor S. [4]

Affiliations:

[1] Hainan Univ, Sch Comp Sci & Technol, Haikou 570228, Peoples R China

[2] Hainan Univ, Sch Cyberspace Secur, Sch Cryptol, Haikou 570228, Peoples R China

[3] Hainan Univ, Hainan Blockchain Technol Engn Res Ctr, Haikou 570228, Peoples R China

[4] Texas Tech Univ, Dept Comp Sci, Lubbock, TX 79409 USA

Source:

CMC-COMPUTERS MATERIALS & CONTINUA | 2023 / Vol. 74 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

Deep learning; semantic segmentation; CNN; MLP; transformer; NETWORK;

DOI:

10.32604/cmc.2023.032757

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Image semantic segmentation is an important branch of computer vision with a wide variety of practical applications, such as medical image analysis, autonomous driving, and virtual or augmented reality. In recent years, because transformers and multilayer perceptrons (MLPs) have achieved performance in computer vision comparable to that of convolutional neural networks (CNNs), a substantial body of image semantic segmentation work has aimed at developing different types of deep learning architectures. This survey aims to provide a comprehensive overview of deep learning methods in the field of general image semantic segmentation. First, the commonly used image segmentation datasets are listed. Next, pioneering works are studied in depth from multiple perspectives (e.g., network structures, feature fusion methods, attention mechanisms) and are divided into four categories according to network architecture: CNN-based architectures, transformer-based architectures, MLP-based architectures, and others. Furthermore, this paper presents some common evaluation metrics and compares the respective advantages and limitations of popular techniques, both in terms of architectural design and in terms of their experimental results on the most widely used datasets. Finally, possible future research directions and challenges are discussed for the reference of other researchers.

Citation information:

Pages: 1941-1957

Page count: 17

References: 77 in total

