Traffic Sign Detection and Recognition Using Multi-Scale Fusion and Prime Sample Attention

Cited by: 21

Authors:

Cao, Jinghao [1]

Zhang, Junju [1]

Huang, Wei [1]

Affiliation:

[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China

Source:

IEEE ACCESS | 2021 / Vol. 9

Keywords:

Traffic sign detection; multi-scale; prime sample attention; features extract

DOI:

10.1109/ACCESS.2020.3047414

Chinese Library Classification (CLC) code:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Traffic sign detection is one of the key technologies in intelligent transportation, but its accuracy is still limited by the small size and diversity of traffic signs. To address this problem, we propose a two-stage CNN object detection algorithm based on multi-scale feature fusion and prime sample attention. We improve the original Faster R-CNN model in terms of feature extraction and sampling strategy. For feature extraction, to improve the network's ability to detect small objects, we adopt HRNet as the feature extractor. HRNet has four stages: it starts from a high-resolution subnetwork and repeatedly adds parallel high-to-low-resolution subnetworks to form the later stages. Throughout the process, information is repeatedly exchanged across the parallel multi-resolution subnetworks to perform repeated multi-scale fusion. For the sampling strategy, we adopt a simple and effective sampling and learning strategy called Prime Sample Attention (PISA), which consists of Importance-based Sample Reweighting (ISR) and Classification-Aware Regression Loss (CARL). PISA introduces the concepts of IoU Hierarchical Local Rank (IoU-HLR) and Score Hierarchical Local Rank (Score-HLR), which rank the importance of positive and negative samples in a mini-batch, respectively. With the proposed method, training focuses on prime samples rather than treating all samples equally. The algorithmic complexity of our method is lower than that of other state-of-the-art methods. In experiments on the TT100K dataset, our method attains comparable or even better detection accuracy and robustness.
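
As a rough illustration of the IoU-HLR ranking mentioned in the abstract, the following Python sketch orders positive samples by importance. It assumes the procedure as it is usually summarized from the PISA paper: samples are grouped by their matched ground-truth object, ranked within each group by descending IoU, and then ordered globally by that local rank, with IoU breaking ties. The function name `iou_hlr` and the `(gt_id, iou)` input format are hypothetical, chosen only for this example, and are not taken from the authors' implementation.

```python
from collections import defaultdict


def iou_hlr(samples):
    """Return sample indices ordered from most to least important.

    `samples` is a list of (gt_id, iou) pairs, one per positive sample,
    where gt_id identifies the ground-truth box the sample is matched to.
    This input format is an assumption made for the sketch.
    """
    # Step 1: group sample indices by their matched ground-truth object.
    groups = defaultdict(list)
    for idx, (gt_id, _iou) in enumerate(samples):
        groups[gt_id].append(idx)

    # Step 2: the local rank of a sample is its position within its group
    # after sorting the group by descending IoU (0 = best in group).
    local_rank = {}
    for members in groups.values():
        members.sort(key=lambda i: samples[i][1], reverse=True)
        for rank, idx in enumerate(members):
            local_rank[idx] = rank

    # Step 3: order all samples by local rank first, then by descending
    # IoU among samples that share the same local rank.
    return sorted(
        range(len(samples)),
        key=lambda i: (local_rank[i], -samples[i][1]),
    )


# Two ground-truth objects, "A" and "B", with five positive samples.
example = [("A", 0.9), ("A", 0.7), ("B", 0.8), ("B", 0.6), ("A", 0.55)]
order = iou_hlr(example)
```

In this toy input the best sample of each object (indices 0 and 2) comes before any second-best sample, which is the property that lets training emphasize the top candidates around every object rather than only the globally highest IoUs.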

Citation:

Pages: 3579-3591

Page count: 13

