Multiscale anchor box and optimized classification with faster R-CNN for object detection

Cited by: 5

Authors:

Wang, Sheng-Ye [1]

Qu, Zhong [1, 2]

Affiliations:

[1] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, Chongqing, Peoples R China

[2] Chongqing Univ Posts & Telecommun, Coll Comp Sci & Technol, 2 Chongwen Rd, Chongqing, Peoples R China

Source:

IET IMAGE PROCESSING | 2023, Vol. 17, No. 5

Funding:

National Natural Science Foundation of China

Keywords:

image processing; image recognition; object detection; features

DOI:

10.1049/ipr2.12714

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

For two-stage object detectors such as the faster region-based convolutional neural network (Faster R-CNN), improving recognition accuracy depends on the proposal boxes generated by the region proposal algorithm. Because of the limitations of the anchor setting in Faster R-CNN, the proposal boxes generated by its region proposal network (RPN) are large, which easily causes a large number of overflows during the sliding search. To improve detection accuracy and mitigate the anchor box overflow problem, multi-scale anchor box and moving overflow anchor box strategies are introduced. Then, to widen the range of positive foreground samples, a hierarchical weight cross-entropy classification function is used for binary classification in the RPN. Together, these strategies improve detection accuracy. The method achieves 76.2% AP on the Pascal VOC 2007 (VOC 07) dataset, which is 2.7% higher than Faster R-CNN. On the Pascal VOC 2012 (VOC 12) test set, it achieves 75.6% AP, an improvement of 2.5% over Faster R-CNN.
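The abstract names the multi-scale anchor box and moving overflow anchor box strategies without giving their details. As a rough illustration of the general idea only, the sketch below generates anchors at several scales and aspect ratios around one point and then translates any anchor that crosses the image border back inside the image. The function names `make_anchors` and `shift_inside`, the scale and ratio values, and the shift rule itself are all hypothetical choices made for this example; they are not taken from the paper and may differ from the authors' method.

```python
from itertools import product


def make_anchors(cx, cy, scales, ratios):
    """Return (x1, y1, x2, y2) anchors centred on (cx, cy).

    Each anchor has area scale**2 and aspect ratio height / width = ratio.
    The scale and ratio sets are assumptions, not values from the paper.
    """
    anchors = []
    for scale, ratio in product(scales, ratios):
        w = scale / ratio ** 0.5
        h = scale * ratio ** 0.5
        anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def shift_inside(box, img_w, img_h):
    """Translate a box so that it lies inside the image, keeping its size.

    Assumed rule for illustration: move the box by the smallest offset
    that removes the overflow; a box larger than the image is clipped.
    """
    x1, y1, x2, y2 = box
    dx = -x1 if x1 < 0 else min(0.0, img_w - x2)
    dy = -y1 if y1 < 0 else min(0.0, img_h - y2)
    x1, x2 = x1 + dx, x2 + dx
    y1, y2 = y1 + dy, y2 + dy
    # Final clip only matters when the box is wider or taller than the image.
    return (max(0.0, x1), max(0.0, y1),
            min(float(img_w), x2), min(float(img_h), y2))
```

For an anchor centred near a corner, for example `make_anchors(8, 8, (32, 64), (0.5, 1.0, 2.0))`, most boxes extend past the top-left border; `shift_inside` moves them back so they can still be used as candidates instead of being discarded.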

Pages: 1322-1333

Page count: 12
