Mixed local channel attention for object detection

Cited by: 198

Authors:

Wan, Dahang

Lu, Rongsheng [1]

Shen, Siyuan

Xu, Ting

Lang, Xianli

Ren, Zhijie

Affiliations:

[1] Hefei Univ Technol, Sch Instrument Sci & Optoelect Engn, Hefei 230009, Peoples R China

Source:

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE | 2023 / Volume 123

Keywords:

Attention mechanism; Local channel attention; Object detection; Deep learning algorithm; Convolutional neural network; DATASET;

DOI:

10.1016/j.engappai.2023.106442

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Attention mechanisms, among the most widely used components in computer vision, help neural networks emphasize significant elements and suppress irrelevant ones. However, the vast majority of channel attention mechanisms capture only channel feature information and ignore spatial feature information, which degrades model representation or object detection performance, while spatial attention modules are often complex and computationally expensive. To strike a balance between performance and complexity, this paper proposes a lightweight Mixed Local Channel Attention (MLCA) module that improves the performance of object detection networks. MLCA incorporates channel and spatial information, as well as local and global information, to improve the network's representational power. On this basis, the MobileNet-Attention-YOLO (MAY) algorithm is presented for comparing the performance of various attention modules. On the PASCAL VOC and SIMD datasets, MLCA achieves a better balance between model representation efficacy, performance, and complexity than alternative attention techniques. Compared with the Squeeze-and-Excitation (SE) attention mechanism on the PASCAL VOC dataset and the Coordinate Attention (CA) method on the SIMD dataset, the mAP improves by 1.0% and 1.5%, respectively.


Pages: 15

References: 87 in total

