Multi-scale semantic enhancement network for object detection

Cited by: 3

Authors:

Guo, Dongen [1]

Wu, Zechen [1]

Feng, Jiangfan [2]

Zou, Tao [2]

Affiliations:

[1] Nanyang Inst Technol, Sch Comp & Software, 80 Changjiang Rd, Nanyang 473004, Henan, Peoples R China

[2] Chongqing Univ Posts & Telecommun, Chongqing Engn Res Ctr Spatial Big Data Intelligen, 2,Chongwen Rd, Chongqing 400065, Peoples R China

Source:

SCIENTIFIC REPORTS | 2023 / Vol. 13 / Issue 01

Funding:

National Natural Science Foundation of China;

Keywords:

DOI:

10.1038/s41598-023-34277-7

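Chinese Library Classification (CLC):

O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract: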

In the field of object detection, the feature pyramid network (FPN) can effectively extract multi-scale information. However, the majority of FPN-based methods suffer from a semantic gap between features of various sizes before feature fusion, which can lead to feature maps with significant aliasing. In this paper, we present a novel multi-scale semantic enhancement feature pyramid network (MSE-FPN) that alleviates these problems with three modules: a semantic enhancement module, a semantic injection module, and a gated channel guidance module. Specifically, inspired by the strong ability of the self-attention mechanism to model context, we propose a semantic enhancement module that models global context to obtain global semantic information before feature fusion. We then propose the semantic injection module, which divides and merges the global semantic information into feature maps at various scales, narrowing the semantic gap between features at different scales and making efficient use of the semantic information in high-level features. Finally, to mitigate the feature aliasing caused by feature fusion, the gated channel guidance module selectively outputs crucial features via a gating unit. By replacing FPN with MSE-FPN in Faster R-CNN, our models achieve 39.4 and 41.2 average precision (AP) using ResNet50 and ResNet101 as the backbone network, respectively. When using ResNet-101-64x4d as the backbone, MSE-FPN achieves up to 43.4 AP. Our results demonstrate that replacing FPN with MSE-FPN significantly enhances the detection performance of state-of-the-art FPN-based detectors.
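
The abstract says the gated channel guidance module "selectively outputs crucial features via a gating unit" but gives no design details. The sketch below is only an illustration of the general idea of channel gating, not the authors' module: it assumes a squeeze-style gate (per-channel global average, one linear layer, sigmoid). The function name `channel_gate`, the weight shapes, and the random inputs are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function, mapping reals into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(feature, weight, bias):
    """Rescale each channel of a fused feature map by a learned gate.

    feature: array of shape (C, H, W), e.g. one fused pyramid level
    weight:  array of shape (C, C), hypothetical gate parameters
    bias:    array of shape (C,), hypothetical gate parameters
    Returns the gated feature map and the per-channel gate values.
    """
    # Summarize every channel by its spatial mean (global average pooling).
    descriptor = feature.mean(axis=(1, 2))        # shape (C,)
    # One linear layer plus sigmoid gives a gate in (0, 1) per channel.
    gate = sigmoid(weight @ descriptor + bias)    # shape (C,)
    # Broadcast the gate over the spatial dimensions.
    return feature * gate[:, None, None], gate


# Illustrative input: 4 channels on an 8x8 grid (assumed sizes).
rng = np.random.default_rng(0)
fused = rng.standard_normal((4, 8, 8))
weight = rng.standard_normal((4, 4)) * 0.1
bias = np.zeros(4)

gated, gate = channel_gate(fused, weight, bias)
```

Channels whose gate is close to 0 are suppressed and those close to 1 pass through nearly unchanged, which is one plausible way a gating unit could damp aliased channels after fusion.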


Pages: 11
