An approach to improve SSD through mask prediction of multi-scale feature maps

Cited by: 6

Authors:

Sun, Peng ^{[1]}

Zhao, Yaqin ^{[2]}

Zhu, Songhao ^{[1]}

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing, Peoples R China

[2] Nanjing Forestry Univ, Coll Mech & Elect Engn, Nanjing, Peoples R China

Source:

PATTERN ANALYSIS AND APPLICATIONS | 2021 / Volume 24 / Issue 03

Keywords:

SSD; FPN; Softmax; Deep learning;

DOI:

10.1007/s10044-021-00993-x

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a novel single-shot object detection network with a mask prediction branch. Our motivation is to enhance object detection features with semantic information extracted from deeper layers. The proposed mask prediction branch enriches important features in shallower layers with a pixel-wise probability distribution of semantic information. In addition, an improved receptive field block is adopted to enlarge the receptive field of the backbone network without much extra computational cost. Our network significantly outperforms SSD and FSSD (Feature Fusion Single Shot Multi-box Detector) with only a small drop in speed. We also discuss the relationship between effective receptive fields and theoretical receptive fields on the VGG16 backbone network. Comprehensive experimental results on PASCAL VOC 2007 demonstrate the effectiveness of the proposed method: we achieve 79.8 mAP with 300 × 300 input images (81.2 mAP with 512 × 512 inputs) at 58.4 FPS on a single Nvidia 1080Ti GPU. These results show that the proposed network achieves performance comparable to the state of the art.
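
As a rough illustration of the "theoretical receptive field" mentioned in the abstract, the sketch below applies the usual layer-by-layer recurrence to the standard VGG16 convolutional trunk (3 × 3 convolutions with stride 1, 2 × 2 max pooling with stride 2). The layer list and the names `VGG16_LAYERS` and `theoretical_receptive_fields` are assumptions made for this example; the sketch does not reproduce the paper's analysis, ignores the dilated layers that SSD adds after the trunk, and says nothing about the effective receptive field, which is measured empirically and is generally smaller.

```python
# Standard VGG16 convolutional trunk as (name, kernel size, stride).
# Assumption: plain VGG16, not the SSD-modified variant.
VGG16_LAYERS = [
    ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
    ("conv2_1", 3, 1), ("conv2_2", 3, 1), ("pool2", 2, 2),
    ("conv3_1", 3, 1), ("conv3_2", 3, 1), ("conv3_3", 3, 1), ("pool3", 2, 2),
    ("conv4_1", 3, 1), ("conv4_2", 3, 1), ("conv4_3", 3, 1), ("pool4", 2, 2),
    ("conv5_1", 3, 1), ("conv5_2", 3, 1), ("conv5_3", 3, 1), ("pool5", 2, 2),
]


def theoretical_receptive_fields(layers):
    """Return {layer name: receptive field side length in input pixels}."""
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent outputs
    result = {}
    for name, kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra tap widens the field by one jump
        jump *= stride             # striding spreads later taps further apart
        result[name] = rf
    return result


rf_by_layer = theoretical_receptive_fields(VGG16_LAYERS)
for name, rf in rf_by_layer.items():
    print(f"{name}: {rf} x {rf}")
```

Under these assumptions, `conv4_3` (the shallowest feature map SSD uses for detection) has a theoretical receptive field of 92 × 92 pixels, and `conv5_3` has one of 196 × 196.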

Citation

Pages: 1357-1366

Page count: 10
