Adaptive feature fusion with attention mechanism for multi-scale target detection

Cited by: 32
Authors
Ju, Moran [1, 2, 3, 4, 5]
Luo, Jiangning [6]
Wang, Zhongbo [1, 2, 3, 4, 5]
Luo, Haibo [1, 2, 4, 5]
Affiliations
[1] Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110016, Liaoning, Peoples R China
[2] Chinese Acad Sci, Inst Robot & Intelligent Mfg, Shenyang 110016, Liaoning, Peoples R China
[3] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[4] Chinese Acad Sci, Key Lab Opt Elect Informat Proc, Shenyang 110016, Liaoning, Peoples R China
[5] Key Lab Image Understanding & Comp Vis, Shenyang 110016, Liaoning, Peoples R China
[6] McGill Univ, Montreal, PQ H3A 0G4, Canada
Keywords
Deep learning; Target detection; Adaptive feature fusion; Attention mechanism; RECOGNITION;
DOI
10.1007/s00521-020-05150-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
To detect targets of different sizes, target detectors such as YOLO V3 and DSSD use multi-scale outputs. To improve detection performance, YOLO V3 and DSSD perform feature fusion by combining two adjacent scales. However, fusing features only between adjacent scales is not sufficient, because it does not take advantage of the features at other scales. Moreover, concatenation, a common operation for feature fusion, provides no mechanism for learning the importance and correlation of features at different scales. In this paper, we propose adaptive feature fusion with attention mechanism (AFFAM) for multi-scale target detection. AFFAM uses a pathway layer and a subpixel convolution layer to resize the feature maps, which helps the network learn better and more complex feature mappings. In addition, AFFAM uses a global attention mechanism and a spatial position attention mechanism to adaptively learn, respectively, the correlation of the channel features and the importance of the spatial features at different scales. Finally, we combine AFFAM with YOLO V3 to build an efficient multi-scale target detector. Comparative experiments are conducted on the PASCAL VOC, KITTI and Smart UVM datasets. YOLO V3 with AFFAM achieves 84.34% mean average precision (mAP) at 19.9 FPS on PASCAL VOC, 87.2% mAP at 21 FPS on KITTI and 99.22% mAP at 20.6 FPS on Smart UVM, outperforming other state-of-the-art target detectors.
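The abstract mentions a subpixel convolution layer for resizing feature maps. As a rough illustration only, the sketch below shows the channel-to-space rearrangement ("pixel shuffle") that such layers commonly end with, written in plain Python on nested lists. The function name `pixel_shuffle`, the channel ordering (which follows the widely used depth-to-space convention), and the omission of the preceding convolution are all assumptions made for this example; the record does not describe the paper's actual layer configuration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Hypothetical helper: only the rearrangement step is shown; the
    learned convolution that would normally precede it is omitted.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    # Allocate the upscaled output: c_out channels of size (h*r, w*r).
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out


# Four 1x1 input channels become a single 2x2 output channel.
small = [[[1]], [[2]], [[3]], [[4]]]
upscaled = pixel_shuffle(small, 2)
```

Here each group of r*r low-resolution channels supplies the values for one r-by-r block of the higher-resolution map, so upsampling is done by rearranging values that a preceding convolution would learn, not by fixed interpolation.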
Pages: 2769-2781
Page count: 13