Improved Design Based on IoU Loss Functions for Bounding Box Regression

Cited by: 3

Authors:

Liu, Zhibo [1]

Cheng, Jian [1]

Wang, Qinzheng [1]

Xian, Lihua [1]

Affiliations:

[1] Univ Sci & Technol China, Dept Automat, Hefei, Peoples R China

Source:

2022 IEEE 6TH ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC) | 2022

Keywords:

computer vision; object detection; loss function; deep learning

DOI:

10.1109/IAEAC54830.2022.9929938

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In object detection, the bounding box regression loss strongly influences localization quality. The common choices at present are the smooth L1 loss and Intersection over Union (IoU)-based loss functions. To address the large training error and low training accuracy of IoU-based loss functions, two improved loss functions, BaIoU and BhIoU, are proposed. BaIoU combines the balanced L1 loss with the original IoU loss function; BhIoU builds on BaIoU and increases the loss gradient by modifying the form of the IoU term. A bounding box regression simulation experiment shows that BaIoU and BhIoU effectively overcome the slow convergence and large training error of IoU-based loss functions. On the MS COCO dataset, training and testing with a two-stage object detector and a single-stage object detector show that BaIoU and BhIoU improve detector performance, with better localization accuracy than existing IoU-based loss functions.
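For readers unfamiliar with the two ingredients the abstract names, the following is a minimal sketch of a plain IoU loss and a balanced L1 loss for axis-aligned boxes. It is not the authors' BaIoU or BhIoU: the abstract does not say how the two terms are combined, so no combination is attempted here. The function names, the (x1, y1, x2, y2) box layout, and the defaults alpha=0.5 and gamma=1.5 are assumptions chosen for illustration.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss: 1 - IoU."""
    return 1.0 - iou(pred, target)


def balanced_l1(x, alpha=0.5, gamma=1.5):
    """Balanced L1 loss on a scalar regression error x.

    b is chosen so that alpha * ln(b + 1) = gamma, which makes the
    two branches meet continuously at |x| = 1.
    """
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    if ax < 1.0:
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    return gamma * ax + gamma / b - alpha


if __name__ == "__main__":
    print(iou_loss((0, 0, 2, 2), (1, 1, 3, 3)))   # partially overlapping boxes
    print(iou_loss((0, 0, 1, 1), (5, 5, 6, 6)))   # disjoint boxes
    print(balanced_l1(0.3), balanced_l1(2.0))
```

In this sketch, any pair of disjoint boxes gets the same IoU loss of 1 no matter how far apart the boxes are, so the plain IoU term gives no signal about distance in that case.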

Pages: 452-458

Page count: 7
