Learning Shape-Biased Representations for Infrared Small Target Detection

Cited by: 10

Authors:

Lin, Fanzhao [1,2]

Ge, Shiming [1,2]

Bao, Kexin [1,2]

Yan, Chenggang [3]

Zeng, Dan [4,5]

Affiliations:

[1] Chinese Acad Sci, Inst Informat Engn, Beijing 100084, Peoples R China

[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing 100049, Peoples R China

[3] Hangzhou Dianzi Univ, Sch Commun Engn, Hangzhou, Peoples R China

[4] Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Peoples R China

[5] Shanghai Univ, Dept Commun Engn, Shanghai 200040, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA | 2024 / Vol. 26

Keywords:

Shape; Object detection; Feature extraction; Decoding; Kernel; Image reconstruction; Task analysis; Infrared small target detection; shape-biased representation; object segmentation; deep learning; FILTER; MODEL; DIM;

DOI:

10.1109/TMM.2023.3325743

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Infrared small target detection aims to accurately localize objects against complex backgrounds, where object textures are often dim and object shapes vary. A feasible solution is to learn discriminative representations with deep convolutional neural networks (CNNs). However, the representations learned by traditional deep CNNs often suffer from low shape bias. In this work, we propose a unified framework that learns shape-biased representations for infrared small target detection by explicitly incorporating shape information into model learning. The framework cascades a large-kernel encoder and a shape-guided decoder to learn discriminative shape-biased representations in an end-to-end manner. The large-kernel encoder encodes infrared images into shape-preserving representations by using a few convolutions whose kernel size is as large as $9\times 9$, in contrast to the commonly used $3\times 3$. The shape-guided decoder addresses two tasks simultaneously: it decodes the encoder representations via upsampling to reconstruct the segmentation, and it hierarchically fuses the decoder representations with edge information via cascaded gated ResNet blocks to reconstruct the contour. In this way, the learned shape-biased representations are effective for identifying infrared small targets. Extensive experiments show that our approach outperforms 18 state-of-the-art methods.
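As a rough illustration of the $9\times 9$ versus $3\times 3$ contrast mentioned in the abstract, the sketch below compares the per-axis receptive field and the weight count of a single large kernel against a stack of small kernels. It is not the paper's encoder: the function names and the choice of four stacked $3\times 3$ layers are assumptions made only for this example, and the formula assumes stride-1, undilated convolutions.

```python
# Illustrative arithmetic only; the helper names and the 4-layer 3x3 stack
# are assumptions for this example, not details taken from the paper.

def receptive_field(kernel_sizes):
    """Per-axis receptive field of stacked stride-1, undilated convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1  # each layer widens the field by (kernel size - 1)
    return rf


def weights_per_channel_pair(kernel_sizes):
    """Weights per input/output channel pair across the stack, ignoring biases."""
    return sum(k * k for k in kernel_sizes)


single_large = [9]            # one 9x9 convolution
stacked_small = [3, 3, 3, 3]  # four 3x3 convolutions in sequence

for name, stack in [("one 9x9", single_large), ("four 3x3", stacked_small)]:
    print(name, receptive_field(stack), weights_per_channel_pair(stack))
```

Under these assumptions both configurations cover a 9-pixel receptive field per axis, but the single large kernel reaches it in one layer at a higher weight cost (81 versus 36 weights per channel pair).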


Pages: 4681-4692

Page count: 12
