A Lightweight Camouflaged Object Detection Model Based on Improved Attention Mechanism

Cited by: 0

Authors:

Song, Jinyu [1]

Luo, Xianzhi [1]

Jiang, Li [2]

Zhang, Yan [2]

Liu, Chun [2]

Affiliations:

[1] Hubei University, School of Artificial Intelligence, Wuhan, People's Republic of China

[2] Hubei University, School of Computer Science and Information Engineering, Wuhan, People's Republic of China

Source:

COMPUTER ANIMATION AND SOCIAL AGENTS, CASA 2024, PT II | 2025 / Vol. 2375

Funding:

National Natural Science Foundation of China;

Keywords:

computer vision; camouflaged object detection; attention mechanism; lightweight; transformer;

DOI:

10.1007/978-981-96-2684-7_12

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Camouflaged object detection is a computer vision task that aims to detect objects that blend closely into their surroundings, and it has practical value in military reconnaissance, medical monitoring, and other fields. In recent years, Transformer-based camouflaged object detection models have become a research hotspot. However, classic Transformer models are expensive to train, and their dot-product attention mechanism has quadratic computational complexity, which leads to high memory usage and limits deployment on embedded edge devices with limited memory and computing resources. To address these issues, this paper proposes E-UGTR, a lightweight camouflaged object detection model with an improved attention mechanism. A linear-complexity attention mechanism reduces the model's computational complexity, and a flexible attention control strategy improves the model's adaptability to different computing resource budgets. Specifically, a general-purpose linear attention module, E-Attention, is introduced into the UGTR model to build the lightweight, adaptive E-UGTR model. Experiments on common public datasets show that E-UGTR trains about 1.8 times as fast as UGTR and runs inference about 1.5 times as fast. Compared with other classic SOTA models, E-UGTR is highly compatible and maintains good detection performance while remaining lightweight.
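The contrast the abstract draws between quadratic dot-product attention and linear-complexity attention can be illustrated with a minimal NumPy sketch. This is not the E-Attention module, whose design the abstract does not describe; it is a generic kernelized linear attention, and the feature map `elu(x) + 1`, the function names, and the single-head, unbatched shapes are all assumptions made for illustration.

```python
import numpy as np


def softmax_attention(q, k, v):
    """Standard dot-product attention; materializes an (n, n) score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v


def feature_map(x):
    """Positive feature map elu(x) + 1 (an assumed choice, not from the paper)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Kernelized attention computed without any (n, n) intermediate."""
    qf, kf = feature_map(q), feature_map(k)
    kv = kf.T @ v            # (d, d_v): keys and values summarized once
    z = kf.sum(axis=0)       # (d,): normalizer terms
    return (qf @ kv) / (qf @ z)[:, None]


def quadratic_reference(q, k, v):
    """Same kernelized attention, written the quadratic way for comparison."""
    qf, kf = feature_map(q), feature_map(k)
    w = qf @ kf.T            # (n, n)
    return (w / w.sum(axis=-1, keepdims=True)) @ v
```

Reordering the product as `qf @ (kf.T @ v)` gives the same result as `(qf @ kf.T) @ v` but costs O(n d d_v) instead of O(n^2 d), which is the general reason linear attention reduces memory use on long token sequences.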

Pages: 150-164

Page count: 15

