Residual attention mechanism and weighted feature fusion for multi-scale object detection

Cited by: 3
Authors
Zhang, Jie [1]
Qi, Qiye [1]
Zhang, Huanlong [1]
Du, Qifan [1]
Wang, Fengxian [1]
Shi, Xiaoping [2]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Elect & Informat Engn, Dongfeng Rd, Zhengzhou 450002, Henan, Peoples R China
[2] Harbin Inst Technol, Harbin, Peoples R China
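Funding
National Science Foundation (USA)
Keywords
Deep learning; Object detection; Residual attention mechanism; Weighted feature fusion
DOI
10.1007/s11042-023-14997-8
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract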
Object detection is a critical problem in computer vision research and an essential basis for understanding the high-level semantic information in images. To improve detection performance, this article proposes a multi-scale object detection method based on an improved YOLOv3. First, a residual attention module, consisting of a channel attention module, a spatial attention module, and a skip connection, is introduced into the neck of YOLOv3. The module is applied to the three feature layers produced by the backbone, so that the output features focus on the channels and regions related to the object. Second, an additional weight is assigned to each input feature in the top-down feature fusion stage of YOLOv3; the magnitude of each weight is determined by how much the corresponding input feature contributes to the output features. Experimental results on the KITTI, PASCAL VOC, and bird's nest datasets verify the effectiveness of the proposed method for object detection. The method is of practical value for electric power inspection and self-driving automobiles.
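The two components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the layer layout, so the sketch assumes parameter-free channel and spatial gates (pooled statistics passed through a sigmoid, loosely in the spirit of CBAM) wrapped in a skip connection, and a fusion step that normalizes non-negative weights before summing. All function names and the `eps` parameter are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x has shape (C, H, W). One gate per channel from pooled statistics.
    # Assumption: no learned layers; a real module would use a small MLP.
    avg = x.mean(axis=(1, 2))
    mx = x.max(axis=(1, 2))
    return sigmoid(avg + mx)[:, None, None]   # shape (C, 1, 1)

def spatial_attention(x):
    # One gate per spatial location from statistics pooled over channels.
    avg = x.mean(axis=0)
    mx = x.max(axis=0)
    return sigmoid(avg + mx)[None, :, :]      # shape (1, H, W)

def residual_attention(x):
    # Channel gate, then spatial gate, then the skip connection.
    refined = x * channel_attention(x)
    refined = refined * spatial_attention(refined)
    return x + refined

def weighted_fusion(features, weights, eps=1e-4):
    # Inputs must already share one shape (e.g. after upsampling).
    # Weights are clipped to be non-negative and normalized to sum to ~1.
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))         # toy backbone feature map
out = residual_attention(feat)
fused = weighted_fusion([feat, out], weights=[1.0, 3.0])
```

In a trained network the fusion weights would be learnable parameters, so that each weight can reflect how much its input contributes to the fused output; here they are fixed constants purely for illustration.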
Pages: 40873-40889
Page count: 17