FSNet: Focus Scanning Network for Camouflaged Object Detection

Cited by: 45

Authors:

Song, Ze ^{[1,2]}

Kang, Xudong ^{[3]}

Wei, Xiaohui ^{[1,2]}

Liu, Haibo ^{[3]}

Dian, Renwei ^{[3]}

Li, Shutao ^{[1,2]}

Affiliations:

[1] Hunan Univ, Key Lab Visual Percept & Artificial Intelligence H, Changsha 410082, Peoples R China

[2] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China

[3] Hunan Univ, Sch Robot, Changsha 410082, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2023 / Vol. 32

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Transformers; Task analysis; Object detection; Image color analysis; Charge coupled devices; Image edge detection; Convolutional neural networks; Camouflaged object detection; swin transformer; SALIENT OBJECT; SEGMENTATION; EVOLUTION;

DOI:

10.1109/TIP.2023.3266659

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Camouflaged object detection (COD) aims to discover objects that blend in with the background because of similar colors, textures, or other visual properties. Existing deep learning methods do not systematically illustrate the key tasks in COD, which seriously hinders the improvement of its performance. In this paper, we introduce the concept of focus areas, which represent regions containing discernible colors or textures, and develop a two-stage focus scanning network for camouflaged object detection. Specifically, a novel encoder-decoder module is first designed to determine a region where the focus areas may appear. In this process, a multi-layer Swin transformer is deployed to encode global context information between the object and the background, and a novel cross-connection decoder is proposed to fuse cross-layer textures or semantics. Then, we utilize multi-scale dilated convolution to obtain discriminative features at different scales in the focus areas. Meanwhile, a dynamic difficulty aware loss is designed to guide the network to pay more attention to structural details. Extensive experimental results on the benchmarks, including CAMO, CHAMELEON, COD10K, and NC4K, illustrate that the proposed method performs favorably against other state-of-the-art methods.
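The multi-scale dilated convolution mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a one-dimensional toy in plain Python, and the function names `dilated_conv1d` and `multi_scale_features`, the dilation rates (1, 2, 4), and the all-ones kernel are assumptions chosen only for illustration. It shows how one small kernel, applied at several dilation rates, samples its input at progressively wider spacings.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated convolution (cross-correlation), zero padded.

    Hypothetical helper for illustration; not taken from the paper.
    """
    half = (len(kernel) // 2) * dilation  # offset that centers the kernel
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - half + k * dilation  # taps are spaced `dilation` apart
            if 0 <= j < len(signal):     # positions outside count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def multi_scale_features(signal, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation rates (assumed rates)."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


# A unit impulse at index 4 makes the sampling pattern of each rate visible.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
features = multi_scale_features(signal, kernel)
```

With a three-tap kernel, dilation rate d responds to the impulse at positions 4 - d, 4, and 4 + d, so larger rates cover a wider context without adding kernel weights. A real network would do this in 2D on feature maps and fuse the branches, details this record does not specify.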


Pages: 2267-2278

Number of pages: 12

