Att-U2Net: Using Attention to Enhance Semantic Representation for Salient Object Detection

Cited by: 0

Authors:

Jiang, Chenzhe [1]

Xu, Banglian [1]

Zheng, Qinghe [2]

Li, Zhengtao [3]

Zhang, Leihong [4,5]

Shen, Zimin [1]

Sun, Quan [6]

Zhang, Dawei [4]

Affiliations:

[1] Univ Shanghai Sci & Technol, Coll Publishing, Shanghai 200093, Peoples R China

[2] Shandong Management Univ, Sch Intelligent Engn, Jinan 250357, Peoples R China

[3] Shanghai Zhengfei Elect Technol CO LTD, Shanghai 200436, Peoples R China

[4] Univ Shanghai Sci & Technol, Sch Opt Elect & Comp Engn, Shanghai 200093, Peoples R China

[5] Zhejiang Univ, State Key Lab Extreme Photon & Instrumentat, Hangzhou 310058, Zhejiang, Peoples R China

[6] Natl Univ Def Technol, Coll Adv Interdisciplinary Studies, Changsha 410003, Peoples R China

Source:

IET SIGNAL PROCESSING | 2024 / Vol. 2024

Funding:

National Natural Science Foundation of China;

Keywords:

attention mechanism; salient object detection; U-shaped structure; RECOGNITION;

DOI:

10.1049/sil2/6606572

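Chinese Library Classification (CLC) codes:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract: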

Salient object detection (SOD) has been widely used in computer vision tasks such as image understanding, semantic segmentation, and target tracking; it mimics the human visual perceptual system to find the most visually appealing object. The U2Net model performs well in SOD because of its U-shaped residual structure and its U-shaped backbone, which incorporates feature information at different scales. However, in the U-shaped structure, the global semantic information computed at the topmost layer may be gradually diluted by the large amount of local information in the top-down path. In addition, the U-shaped residual structure pays insufficient attention to features in the salient target region of the image and passes redundant features to the next stage. To address these two shortcomings of the U2Net model, this paper proposes two improvements. An attention gating mechanism is added to the U-structure backbone to filter redundant features, and a channel attention (CA) mechanism is introduced into the residual U-block (RSU) module to capture important features. The experimental results show that the proposed method achieves higher accuracy than the U2Net model.
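
The two mechanisms named in the abstract can be illustrated with a small, dependency-free sketch. This is not the paper's implementation: the abstract does not give the exact layer design, so the code below assumes a squeeze-and-excitation style channel attention and an additive gate in the style of Attention U-Net. All function names, weight shapes, and parameters (`channel_attention`, `attention_gate`, `w1`, `w2`, `wx`, `wg`, `psi`) are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(v):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def channel_attention(feat, w1, w2):
    """Assumed SE-style channel attention (hypothetical stand-in for CA).

    feat: C channels, each an H x W list of floats.
    w1:   R x C weights (reduction), w2: C x R weights (expansion).
    Returns the rescaled feature map and the per-channel weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Excitation: small bottleneck with ReLU, then sigmoid gating.
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    weights = [sigmoid(s) for s in matvec(w2, hidden)]
    # Scale: emphasise or suppress each channel as a whole.
    out = [[[weights[c] * v for v in row] for row in ch]
           for c, ch in enumerate(feat)]
    return out, weights


def attention_gate(skip, gate, wx, wg, psi):
    """Assumed additive attention gate (hypothetical stand-in for the gating).

    skip, gate: C x H x W feature maps of identical shape.
    wx, wg:     F x C projection weights; psi: length-F vector.
    Each pixel of `skip` is multiplied by a coefficient in (0, 1).
    """
    channels, height, width = len(skip), len(skip[0]), len(skip[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            xs = [skip[c][i][j] for c in range(channels)]
            gs = [gate[c][i][j] for c in range(channels)]
            # Combine both projections, apply ReLU, then reduce to a scalar.
            q = [max(0.0, a + b)
                 for a, b in zip(matvec(wx, xs), matvec(wg, gs))]
            alpha = sigmoid(sum(p * v for p, v in zip(psi, q)))
            for c in range(channels):
                out[c][i][j] = alpha * skip[c][i][j]
    return out


# Toy input: 2 channels of size 2 x 2, with hand-picked weights.
feat = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
ca_out, ca_weights = channel_attention(feat, [[1.0, 0.0]], [[0.0], [0.0]])
ag_out = attention_gate(feat, feat, [[1.0, 0.0]], [[0.0, 1.0]], [0.0])
```

In this sketch the channel attention rescales whole channels, while the gate rescales individual spatial positions of a skip feature; that division of labour roughly matches the abstract's description of capturing important features in the RSU module and filtering redundant features in the backbone, but the actual placement and layer sizes in the paper may differ.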

References

Pages: 12

57 entries in total

