An Empirical Study of Attention Networks for Semantic Segmentation

Cited by: 0

Authors:

Guo, Hao [1]

Si, Hongbiao [2]

Jiang, Guilin [2]

Zhang, Wei [3]

Liu, Zhiyan [4]

Zhu, Xuanyi [2]

Zhang, Xulong [5]

Liu, Yang [1]

Affiliations:

[1] Hunan Chasing Secur Co Ltd, Changsha, Peoples R China

[2] Hunan Chasing Financial Holdings Co Ltd, Changsha, Peoples R China

[3] Hunan Chasing Digital Technol Co Ltd, Changsha, Peoples R China

[4] Hunan Chasing Trust Co Ltd, Changsha, Peoples R China

[5] Ping An Technol Shenzhen Co Ltd, Shenzhen, Peoples R China

Source:

WEB AND BIG DATA, PT I, APWEB-WAIM 2023 | 2024 / Vol. 14331

Keywords:

Machine Learning; Deep Learning; Semantic Segmentation; Attention

DOI:

10.1007/978-981-97-2303-4_15

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Semantic segmentation is a vital problem in computer vision. A common recent solution is the end-to-end convolutional neural network, which is much more accurate than traditional methods, and decoders based on attention now achieve state-of-the-art (SOTA) performance on various datasets. However, these networks are usually compared only against the mIoU of previous SOTA networks to prove their superiority; their computational complexity and per-category precision, which are essential for engineering applications, are ignored. In addition, the methods used to measure FLOPs and memory are not consistent across networks, which makes the reported comparisons hard to use. Moreover, although many methods apply attention to semantic segmentation, an overall conclusion about these methods is lacking. This paper first conducts experiments to analyze the computational complexity of these networks and compare their performance. It then summarizes the scenarios each network is suited to and identifies the key points to consider when constructing an attention network. Finally, it points out some future directions for attention networks.


Pages: 222-235

Page count: 14

